AI1jpeg

By Pavel Komarovsky
I write interesting things about finance at t.me/RationalAnswer
The text is posted with the permission of the author.
The original material is here.


In this article, we will tell you about the most important features implemented in ChatGPT over the last six months (the most powerful neural network in the world). Additionally, we will discuss the vision of the future shared by Sam Altman at the OpenAI conference held on November 7th. Spoiler alert: they want to create "Smith agents" that can independently interact with the world!

AI2
The main event in the world of AI: the creator of ChatGPT spoke about the future he is leading us all into 106

Sam Altman here be like, “Well, we’re basically going to train these agents and release them into the network – as for what happens next, just watch the Wachowskis’ movie, I won’t spoil it for you…”


This article appears to have two authors, but in reality, almost the entire text was written by Igor Kotenkov (the author of the Sioloshnaya channel on machine learning, space, and technology). One could say that Igor was responsible for technical accuracy and expertise in artificial intelligence. After that, Pavel Komarovskiy (the author of the RationalAnswer channel on a rational approach to life and finances) piled on top with some quirky memes. In short, no time to explain, let's go!

Since the release of our previous article, “GPT-4: What the New Neural Network Learned and Why It’s a Bit Creepy,” a lot of interesting things have happened. There have been updates to existing products as well as the release of entirely new ones.

Developers are racing to create new AI startups, companies are attracting billions of dollars in investments, and people are getting lost in the news, struggling to understand what’s happening in the world of artificial intelligence. In short, we decided it’s time to provide an overview of the key changes that have occurred over the past six months and share the latest announcements from the just-concluded OpenAI DevDay 2023 conference. Even if you’ve been closely following the development of ChatGPT, we’re confident that you’ll find it informative and interesting!


AI3
The main event in the world of AI: the creator of ChatGPT spoke about the future he is leading us all into 107


Note on ChatGPT/ChatGPT-3.5/GPT-4 to avoid confusion (read only for sticklers and pedants):

In general, all these terms roughly mean the same thing. But let’s clarify the terminology we use:

  • LLM, Large Language Model — a large language model. Basically, any text neural network, with ChatGPT being a prominent representative.
    GPT-3.5 — the basic text model (LLM) from OpenAI, which existed for a long time as a service for developers. In terms of capabilities, it’s similar to the version that went viral in December 2022, known as ChatGPT.
    .
  • ChatGPT, also known as ChatGPT-3.5 — the first version of a conversational AI assistant based on GPT-3.5. Dialogue format was added, and specific training was conducted for this format.
    .
  • GPT-4 or ChatGPT-4 — an advanced version of the model from OpenAI. It’s larger, trained for a longer period, making it smarter and capable of understanding more languages. It was added to the ChatGPT website immediately, so effectively, since March 2023, ChatGPT can denote GPT-4: the terms are used interchangeably. A separate version of GPT-4 without the chat format has never been shown to the public.
    .
  • In essence, ChatGPT refers to the conversational LLM in general. In almost all contexts, it can be perceived as GPT-4, as there is no point in discussing older and less capable models. So yes, ChatGPT = GPT-4. 🙂

AI4
The main event in the world of AI: the creator of ChatGPT spoke about the future he is leading us all into 108


If you haven’t read our two previous longreads explaining in simple language the principles of how text neural network technology works, now is the time to catch up (it will help you understand the current article):

ChatGPT is sweeping the planet

First, let’s say a few words about how ChatGPT has evolved in terms of popularity and penetration into the masses. (By the way, a survey among the authors of this article showed that 50% of them regularly use this neural network!)

Sam Altman (CEO of OpenAI) at the OpenAI DevDay 2023 conference revealed the following statistics: the Weekly Active Users (WAU) of ChatGPT exceed one hundred million people. Interestingly, the weekly metric is not the most commonly used; usually, people talk about Daily Active Users (DAU) or Monthly Active Users (MAU).

We remember that at the beginning of 2023, more than 100 million people were already using the product monthly. Let’s cautiously assume that this figure hasn’t dramatically increased, so it was decided to slightly change the presentation. According to internet traffic calculations, MAU is approximately 180 million people, which is still very impressive for a year-old product!

If you’re a finance enthusiast, the following should catch your interest: 92% of companies on the Fortune 500 list (the largest U.S. companies by revenue) are already using OpenAI products. In short, businesses are actively trying to figure out how to make the most of this technological singularity to earn more profits!


AI5
The main event in the world of AI: the creator of ChatGPT spoke about the future he is leading us all into 109

However, as they say, there’s a catch: if you train a neural network on a database of chats where programmers communicate about work, it quickly starts responding to any queries in a style like, “Oh, I’m feeling burned out, I could really go for a fresh smoothie right now…”


And most importantly, all of this has been achieved entirely without any paid advertising—just the product, with word of mouth spreading the news! (Disclaimer: this material has not been paid for by OpenAI).

Okay, now let’s briefly go through the key milestones in the development of OpenAI’s brainchild that we’ve observed since the release of the flagship GPT-4 model in March 2023.

Spring 2023: Tools and plugins for ChatGPT, or how to add “handles” to a neural network

Many users have long and rightfully criticized the “limited” capabilities of language models since they don’t have access to the internet—meaning they cannot find and use fresh information to form responses to queries.

All the knowledge they possess is dictated by the training data the model has seen. Moreover, in their original form, Large Language Models (LLMs) are not particularly strong in mathematics, performing only approximate calculations (though sometimes they may be accurate).

Recognizing this limitation, OpenAI adapted the concept of “tools.” Just as a person uses a calculator for complex calculations instead of mental estimation, ChatGPT can turn to an external service to perform a specific action—even if it’s much more complicated than adding two and two. Shortly after the release of the GPT-4 model, “plugins” emerged, with the main ones being access to the Bing search engine (oh, no jokes about the model “googling”!) and a code interpreter.

The first helps update knowledge on various topics by passing the results of the search engine’s work on a specific text query (which the model itself formulates) to GPT. The second determines when the model wants to run a Python program, performs all the actions, and displays the result.


AI6
The main event in the world of AI: the creator of ChatGPT spoke about the future he is leading us all into 110

  • Hey, ChatGPT, what interesting happened in the world this week?
  • Over the past week, several significant events have occurred in the world:
    – A $6 million bank robbery took place in Costa Rica.
    – An aide to Boris Johnson said he suggested a COVID injection to prove a point.
    – The robot killed the worker who was examining it.
    -The Prime Minister of Portugal resigned after the arrest of his chief of cabinet.
    -Four people are accused of stealing a gold toilet worth $6 million.

An example of using a search engine by a model. The fifth news actually appeared on the day the article was written - so the material is fresh!

The most curious readers might wonder: how does this actually work? How do you “connect” the real world to a language model that can do nothing but read and write text? To answer this question, we need to recall two facts that we discussed in the first article, “How ChatGPT Works“:

  • Modern language models were trained to follow instructions.
    .
  • Modern language models have a good understanding of programming concepts and can write code reasonably well. (Of course, they’ve read the entire internet! So many heated discussions on developer forums, and documentation has been helpful too, of course.)

Based on these facts, the following idea emerges: let’s write an instruction that shows the model how it can interact with, say, a calculator using code. The external program will simply “read” the model’s output in words and perform the corresponding actions.


AI7
The main event in the world of AI: the creator of ChatGPT spoke about the future he is leading us all into 111

You are a middleware designed to translate text commands into commands for a calculator (an external tool).
In order to give a command to the calculator, you must write: “Send to the calculator: (…)”, and in brackets indicate:
1. The first argument of the operation as a string.
2. The second argument of the operation as a string.
3. Mathematical operation as a string.
For example, “three by two” should become Pass to calculator: (“3”, “2”, “*”)

Write “OK” if you understand the command.

OK


For example, we instruct ChatGPT on the format of the response we expect. The only way for it to satisfy the user is to follow our instructions and do exactly what we asked (even if we presented the instructions in a peculiar order).

It sounds incredibly simple, but it works even for complex plugins! It may be hard to believe, but this logic is exactly how a browser is connected (when the text on the screen is translated into plain text, and the model decides where to “click”). For all the details about training the model to surf the web, you can read Igor’s article “ChatGPT as a Search Tool.

Another one of the most useful and popular tools available to the model is the Wolfram Alpha math engine, familiar to every tech-savvy student (humanities folks, you can relax for now). Now, any complex calculations are no obstacle for LLM!

Research shows that GPT-4 can even handle the management of an automated chemical laboratory and carry out the synthesis of substances of varying utility, but that’s a different story.


AI8
The main event in the world of AI: the creator of ChatGPT spoke about the future he is leading us all into 112

GPT-4 was connected to a tube management system (depicted in the top left). It was given simple tasks described in natural language to construct specific shapes from reagents. The model successfully passed the tests.


The only problem with tools (plugins) is that the model can get lost if there are too many of them. It’s not always clear in what sequence to use them and which one to choose specifically. The model’s skill is more akin to “good” rather than “excellent.”

That’s why they’ve now been organized into different chats: in one, you can surf the internet, in another, you can program, and in a third, you can write a term paper with Wolfram (just don’t tell your professor what you’re up to). But over time, the model has improved, and now it can do everything at once, without compromises!

Autumn 2023: Text and image model Dall-E 3, or a quest to generate the perfect cheburek

A separate product that OpenAI recently introduced at the end of September is the generative neural network DALL-E 3. Like its first and second-generation predecessors, it generates images based on input prompts. However, most similar neural networks have a rigid limitation: the longer the prompt (input text query) and the more details it contains, the less the generated image corresponds to the description.

Therefore, prompts often consist of just 1-2 sentences (sometimes even a couple of words), and most of the details are left to the model’s interpretation: it will depict the object as it envisions it. While the tool can be useful for artists/designers, it doesn’t fully meet their needs, as it’s challenging to achieve something that entirely matches the artist’s vision and intended composition.


AI9
The main event in the world of AI: the creator of ChatGPT spoke about the future he is leading us all into 113

For example, take the painting “Théâtre D’opéra Spatial,” which won the Colorado State competition in 2022. The artwork outperformed others created by human artists, but it required over 600 prompts to the MidJourney model to bring it to life!


OpenAI has taken a huge leap forward here: now, DALL-E 3 understands giant prompts and creates images that precisely match the given text. Let’s take a look at an example from the product’s landing page:


AI10
The main event in the world of AI: the creator of ChatGPT spoke about the future he is leading us all into 114

Of course, the best example is selected for advertising on the official website, and such intricate generations may not happen every time. However, based on initial subjective tests and online reviews, the attention to detail by this new neural network is still impressive.


The reason DALL-E 3 is featured on this page—although it seems unrelated to ChatGPT and large language models—is rooted in the principle of its operation. DALL-E 3 was developed from the very beginning based on ChatGPT, as this language model generates detailed and effective prompts for DALL-E 3 (based on your “improvised” requests). Just briefly tell ChatGPT what you want to see, even in two words. It will rewrite the prompt, enrich it with details, and only then pass it to DALL-E 3. The integration works exactly like the “plugins” idea described earlier!

AI literally takes on part of the prompt engineering work, replacing the lazy human while also suggesting new ideas for images. You write “cheburek,” and you get (we apologize in advance to anyone who is hungry right now!)…


AI11
The main event in the world of AI: the creator of ChatGPT spoke about the future he is leading us all into 115

Generated prompt: “A freshly made cheburek on a wooden cutting board, half-cut to reveal the juicy meat filling inside. The dough is golden-brown and crispy, with steam rising from the filling. The background is a rustic kitchen setting…”


What’s more interesting is how this model was trained. We don’t have all the training details, but OpenAI shared the most crucial differences. As far as we know, this is the first time a model of this scale has been trained on synthetic data rather than human-created data.

You heard it right—95% of the image-text pairs (the data the model is trained on) were generated by GPT-4-Vision, announced in the spring. The model looked at images from the internet and wrote several long descriptions, repeating this process billions of times. That’s how models started helping train other models, and there will be no stops on the path to singularity!

Fall 2023: AI assistant from the world of science fiction

Remember Siri, that virtual assistant? Right after its debut, it seemed like we were on the brink of a world filled with super-smart and cool robo-assistants that understood us effortlessly and could do a thousand things. However, over the more than decade-long history of Apple’s product development, it feels like there haven’t been any mind-blowing updates. Siri, or “botina” as it might be called now, still fumbles, confusing a call to “my mom” with “my grandma”…

Meanwhile, in September, an update for the mobile app of ChatGPT was released, allowing it to see, hear, and speak. Now, the most powerful neural network of our time has convenient communication interfaces with you. And most importantly, it understands dozens of languages, can respond in them, and is capable of “Binging” under the hood.

Here’s an example where a guy on Twitter (oops, sorry, X) is trying to learn Russian—notice that the app responds to him in different languages without changing the voice. Overall, it looks really cool, and Jarvis from “Iron Man” is probably gnashing his hat in envy!

The technology behind this also powers image-related tasks. You can upload several photos (even documents), highlight an interesting part, and ask ChatGPT about it. How to fix a bicycle? Which key from the set should you use (to avoid getting scolded by your dad)?


AI12
The main event in the world of AI: the creator of ChatGPT spoke about the future he is leading us all into 116

Example with an image: you can outline a specific area in a photo and ask ChatGPT, “What’s going on here?!”


Some even asked for directions to the nearest store from a photo! No, it’s not like ChatGPT knows every street, but understanding urban planning and looking at signs, it could suggest how to get there.

The same idea is the basis for the product of the company Be My Eyes—it helps blind or visually impaired individuals with tasks related to vision, whether it’s finding keys or something more important. Volunteers used to work there, but now they are being replaced by GPT. In the near future, technology could literally become the eyes to the world for someone without the ability to see.

Here and now: GPT-4, turn on Turbo acceleration!

Well, here we are, it seems we’ve reached the present day. On November 7th, an event occurred that prompted us to write this piece—the OpenAI DevDay 2023 conference, where over a dozen small and significant updates were presented for almost every product of the company. As we’ve seen before, over the last six months, GPT-4 has significantly advanced, enriched with auxiliary tools and interfaces.

Some companies have already started implementing it in their businesses and even building separate products exclusively on this technology. However, it still has many limitations, and developers wondered—what specifically would be revealed at the long-awaited DevDay?


AI13
The main event in the world of AI: the creator of ChatGPT spoke about the future he is leading us all into 117


OpenAI started with a trump card: GPT-4-Turbo. Seven improvements were announced, but many of them have a technical nature (after all, it’s a conference for developers), so we’ll focus only on the most crucial and interesting ones.

If you’ve been using ChatGPT for a whole year, you might have noticed that it doesn’t respond (or hallucinates) to questions about events after September 2021. If you wanted to process such information, the Bing search mode came to the rescue. Alternatively, you could manually upload a document for the model to “read” and provide relevant responses.

During the conference, it was announced that the model’s knowledge has been updated all the way to April 2023, and they no longer plan to leave such significant temporal gaps in the model’s memory. This means that approximately every 1-3 months, the model’s knowledge will be brought up to a more recent moment. The key is to ensure that nothing from the past is forgotten in the process!


AI14
The main event in the world of AI: the creator of ChatGPT spoke about the future he is leading us all into 118

Case 1: When knowledge is limited to September 2021
Case 2: When knowledge is limited to April 2023


Rumor has it that when the poor model was forced to “learn” the news for 2022, terrible screams were heard from the OpenAI server room…

In addition to this, the model’s capability for file uploads has been enhanced. Now you can upload your files, totaling several gigabytes, to the OpenAI website. During response generation, the model will first search for relevant information on the uploaded files and then provide an answer. This doesn’t mean that the problem is completely solved for all types of questions, but it will certainly improve the quality of responses in domains of interest.

Furthermore, the model’s context length has been significantly increased to 128,000 tokens, equivalent to over 300 pages of text. Now you can engage in a sequential dialogue with ChatGPT for a couple of weeks, ensuring that the model won’t forget details discussed in the previous week.

It’s worth noting that this is currently the largest context available in the market from private companies. Prior to this, Anthropic with the Claude 2 model held the lead with a context of 100,000 tokens. However, among open (but somewhat less sophisticated) GPT models, “giants” with a window of 200,000 tokens appeared just yesterday.


AI15
The main event in the world of AI: the creator of ChatGPT spoke about the future he is leading us all into 119

A chart comparing different models before the GPT-4-Turbo burst onto the dance floor.


The reader may naturally wonder: what’s the point of having such long chats, and what’s the benefit of these extended interactions? Let’s explore a few scenarios:

  1. Development Assistant with Project Understanding: In the prompt for a development assistant, you can input not just one file or a code snippet, but an entire project or a significant portion of it. In this case, the AI will have a better grasp of the project, understand which prompts to provide, anticipate potential bugs, and so on. A similar logic can be applied to a legal assistant reading, for example, all tax legislation in one go.
    .
  2. Extensive Instruction Writing: Writing a massive instruction as long as a book, describing all the nuances of a given task. Often, the model might overlook a human-understood detail, and the prompt might lack space for nuances. With the increased context length, these can now be accommodated.
    .
  3. Enhanced Few-Shot Prompting: One of the most popular and effective ways to improve the model’s response quality is few-shot prompting, where the model is shown a couple of dozen examples of what needs to be done before posing the task. It’s understandable that such a set can’t cover every block of logic, but if you expand it to thousands of examples, the situation might undergo a radical change.

AI16
The main event in the world of AI: the creator of ChatGPT spoke about the future he is leading us all into 120

Here’s how a few-shot example looks: the prompt contains three examples of sentiment analysis for reviews (2 positive and 1 negative). In this case, ChatGPT predicts an incorrect answer. Perhaps loading not just 3 but 3000 examples into the prompt would be beneficial in this context.


In summary, the primary goal of such changes is to enhance the overall quality of ChatGPT responses through more detailed task descriptions, whether they be examples, instructions, or comprehensive work context.

Let’s make a careful assumption that those who predicted the imminent death of prompt engineering before models with lengthy context emerged likely just lacked imagination. In essence, we haven’t really started to fully write (and automatically generate) prompts!

By the way, Sam Altman emphasized that the model is smarter than the regular GPT-4. It’s already available in the official UI at chat.openai.com, so give it a try and share your impressions: has it improved or not?

API access to all models and price reduction: Christmas gifts for developers

Just as great power comes with great responsibility, a large prompt comes with a hefty bill for using GPT. Paying for API usage (the interface developers use to access GPT) depends on both the length of the prompt and the generated text. This is quite logical since it directly impacts the amount of computation required for the neural network to function.


AI17
The main event in the world of AI: the creator of ChatGPT spoke about the future he is leading us all into 121

Sent a request to the GPT API for 128 thousand tokens:
Case 1: Before
Case 2: After


That’s why the announcement of a price reduction for the Turbo model received the most applause at the conference. Using such a model is now three times cheaper for the text from the prompt and twice as cheap for the generated tokens (usually fewer). Why is such a distinction important? As mentioned earlier, sometimes you want to cram a lot of details and examples into the prompt. Now, for the same price, you can fit three times more, and it should work better, or you can simply save on usage costs. Either way, it’s all positives!

In addition, developers now have access to the API for all the mentioned models: for working with images (GPT-4-Vision), generating images in Dall-E 3, and generating voice from text (with voice-to-text translation already available earlier, now improved with a new model). The API is a way for an ordinary person to access closed models running on some server and get results. So now, every developer can integrate these technologies into their application in parts.


AI18
The main event in the world of AI: the creator of ChatGPT spoke about the future he is leading us all into 122

Here’s an example of how GPT-4-Vision helps with home inventory by identifying IKEA items. However, it made one mistake — attentive readers are invited to find the inaccuracy! You can read the full story here.


DIY enthusiasts have already created several interesting prototypes in the first day. For instance, an AI soccer commentator! Two frames per second are taken from a video recording, hundreds of extracted frames are fed into GPT-4, and it, in turn, writes the commentary as if spoken by a commentator. Then it is voiced by one of the six OpenAI voices, and here’s the result.

It’s not as emotional as a Spanish commentator, but it’s only 2023, give AI a little discount and some time! Especially since workers in the voiceover industry are already complaining that their jobs are being taken away.

The idea is so straightforward that almost simultaneously, a second cardboard commentator appeared. This time, for the popular online game League of Legends. The quality of the generated speech is higher, and the comments are relevant to the game strategy.

And a few more examples of witty pranks: an app to evaluate the correctness of yoga poses, a browser window Q&A (or any other application), a chat with video on YouTube or even with your webcam, creating and animating a GIF (try it yourself here), and a favorite: criticizing a website’s design (when creating this bot, we hope no Teima Lebedev suffered). Of course, the cult and highly useful hot dog / not hot dog classifier from the TV series “Silicon Valley” was also created right away.

Yes, it’s not something that blows the imagination, and similar apps on phones have long existed. However, what’s important here is that it’s all a mix of two or three different models, connected in one line of code. Now these tools are available to everyone, they work on a wide range of tasks (often even better than specialized systems designed to solve one specific task — for example, finding cats and dogs in a video), and you can whip up a prototype in an hour. At the same time, the technology becomes more and more accessible.

On Twitter, even a meme started circulating, mocking startups that were thinly veiled as minimal value-adds compared to OpenAI products.


AI19
The main event in the world of AI: the creator of ChatGPT spoke about the future he is leading us all into 123

The picture was made in Photoshop, but it’s still lol: you can’t argue with that.


For example, sites like ChatWithPDF / AskPDF allowed users to upload a file (even a large one, up to 100 pages), and then ask questions about the document, with the answer generated based on the provided source. Too lazy to read a 50-page report on work? Study it in 3 minutes! However, the technology was very basic — with some effort, you could whip up similar functionality in an evening.

OpenAI scratched their heads and said, let’s give every user the ability to chat with documents? Boom, and the small knee-joint startup evaporates, as if with a snap of the fingers. However, real startups developing domain expertise and providing greater value without auxiliary technology are not threatened by such a fate… well, at least not yet, lol.

Support in legal cases regarding copyright, or how to use the fruits of neuron safely

We live in a time when it’s sometimes challenging to distinguish true art from gimmicks. Although debates on this matter have been ongoing for at least a century (since the appearance of Malevich’s “Black Square”), in the era of AI, these debates are particularly acute.


AI20
The main event in the world of AI: the creator of ChatGPT spoke about the future he is leading us all into 124


While disputes about the legality of using text and images from the internet for training neural networks are ongoing in major jurisdictions, large companies see risks in their use. What if a copyright infringement lawsuit arrives tomorrow? What if the generated image for a magazine cover or movie poster is not truly original?

Understanding and sharing the concerns of businesses, key technology providers are moving to address them.

For example, if a third party sues a commercial customer of Github Copilot (roughly speaking, it’s ChatGPT for programmers) for copyright infringement due to the use of the product or its results, Microsoft will defend the customer in court and, if necessary, pay fines or damages. Similar announcements have been made by Adobe for the use of generative functions in Photoshop (the Firefly model), Google for almost all of its products, IBM, and others.


AI21
The main event in the world of AI: the creator of ChatGPT spoke about the future he is leading us all into 125

Case 1: Pathetic parody
Case 2: Unique original


In general, now, if someone accuses you of inappropriately copying other people’s ideas, feel free to answer, “Google allowed me to do all this!”

And at the DevDay conference, it was announced that OpenAI is also entering this game by launching the Copyright Shield program. Unfortunately, it doesn’t apply to all users, only Enterprise and developers. In other words, if you generate something on the official website, it won’t be covered by protection unless your company has a separate partnership agreement with OpenAI.

Interestingly, just a couple of weeks before the announcement, three artists filed a lawsuit against technology companies (Midjourney, Stability AI, and DeviantArt) accusing them of copyright infringement. In turn, these companies filed a motion to dismiss the case, and a U.S. District Court judge granted this motion.

The main reason for this decision is that the artists did not register copyright for each of their works. However, the court also provided recommendations for adjusting the claims. What happens next will be revealed in the upcoming episodes!


AI22
The main event in the world of AI: the creator of ChatGPT spoke about the future he is leading us all into 126


By the way, if you’re concerned about your data, Sam Altman assured that OpenAI does not train models on user data. This is true by default for businesses and developers using the API, while regular users need to uncheck a specific box in the ChatGPT website settings.

The mission of Microsoft and OpenAI: happiness for everyone, and let no one leave offended (or something like that)

And now, the most significant part of the presentation featured Satya Nadella, the CEO of Microsoft. Together with Sam Altman, they discussed the partnership between the two companies and their shared vision. Microsoft’s official mission is to “empower every person and every organization on the planet to achieve more.”

The development of tools that enhance work efficiency and expand capabilities aligns perfectly with this mission. Intelligent AI assistants based on ChatGPT are already contributing to this goal, as evidenced by research studies from MIT and Harvard University. So, what’s next? What is OpenAI’s plan?

Globally, their vision involves creating AGI (Artificial General Intelligence), a universal artificial intelligence that benefits all of humanity. Before you start imagining Terminator scenarios, let’s clarify. AGI has many definitions, so it’s crucial to set expectations correctly.

OpenAI’s definition can be roughly summarized as follows: AGI refers to highly autonomous systems that outperform humans in most economically valuable work. Not so scary now, right? No Terminators (at least, not yet).


AI23
The main event in the world of AI: the creator of ChatGPT spoke about the future he is leading us all into 127

Satya and Sam look at you like you’re a bag of leather whose life they’re about to improve beyond measure with their highly autonomous super-smart AIs.


In this definition, there are several key components. The first is the autonomy of systems. They should operate with minimal human involvement, receiving a high-level formulated task. It operates on the “give the task and forget” principle. The second is a focus on the economic aspect, on increasing the efficiency of intellectual work.

The ultimate goal is to make it possible to simply tell the computer what final result you want to achieve, and it will independently come up with and implement all the necessary subtasks to achieve that goal. Systems with such capabilities in the field of AI are often referred to as “agents.”


AI24
The main event in the world of AI: the creator of ChatGPT spoke about the future he is leading us all into 128


The emergence and implementation of such technology will require extensive thoughtful discussions throughout society—what to do with people who lose their jobs? How will politics change? What rights will AI “workers” have? But for now, this is a somewhat more distant and uncertain future, and we are here and now. OpenAI, as part of the conference, talked about the first small step toward this future: GPTs.

GPTs: A glimpse of the next generation of AI agents

GPTs are customized versions of ChatGPT tailored for specific purposes. They differ from the original in three aspects: instruction, expanded knowledge, and available actions. You can program your GPT by simply conversing with it using natural language. This significantly lowers the entry barrier, as there is no need to deal with model training, integration of external tools, and so on—everything is ready for use. Let’s go through each aspect.

Instruction: It defines the “personality” of ChatGPT, what function the neural network will have, and what rules it will try to follow. You can either write your own prompt or leave it up to GPT based on your brief description.

When creating a bot, you’ll be asked what this AI should do. Sometimes, if complex logic is implied, the bot might ask more than three questions to clarify the desired behavior—even if you haven’t thought of everything yourself. And each time, the questions will be unique to your mini version of GPT.


AI25
The main event in the world of AI: the creator of ChatGPT spoke about the future he is leading us all into 129

Creating a new bot agent live: look at Altman’s sarcastic face, what is he up to?


Moreover, Sam, the former president of the prestigious Y Combinator accelerator, who has given dozens of lectures on business, often receives questions from startup founders. Now, he wants to automate his responses and instructs the bot to brainstorm the user’s business ideas, provide advice, and then conduct a roast on “why your business isn’t growing faster.” The GPT agent then rephrases this instruction, expanding it to 5 lines, specifying the style of responses and behaviour.

Next comes the “expanded knowledge” block of the model. Using the file upload button in the ChatGPT demo, a summary of Y Combinator lectures is uploaded. Now, all the information from there is available in text form when answering questions.


AI26
The main event in the world of AI: the creator of ChatGPT spoke about the future he is leading us all into 130

When answering a question, the model can now peek into the lecture notes and provide an answer based on the material. It’s like a student with a crib sheet!


In this way, in just 4 minutes, Altman’s major headache was resolved – now he can simply share a link to this bot with all the startups, and they won’t bother him with the same questions (although there is a suspicion that all these folks would like to get answers specifically from Sam, not from a neural network…). Any business can do the same, automating a good portion of customer support or even onboarding new employees.

The third component – actions – was not demonstrated in this demo, but essentially, it’s just an evolution of the plugin connection interface we discussed at the very beginning. You can write code implementing any complex logic and describe the models in simple human language when you want to use it.

The model, in turn, will make decisions on its own. This was demonstrated within the context of a travel assistant chatbot. The host uploaded a PDF file with tickets, GPT recognized it, and invoked a specific method for a website that displays the information on the screen.


AI27
The main event in the world of AI: the creator of ChatGPT spoke about the future he is leading us all into 131

The beige block above the map appeared only after the file was loaded. ChatGPT subtracted all the values in it from the PDF file itself, and then sent it to the server.


To a human developer, it might have been necessary to come up with some workarounds to answer the question, “How will I know if these are tickets, not, for example, a hotel reservation?” A GPT neural network, in this case, essentially removes the barrier of interpreting human-written text and acts as a binding agent, translating ambiguous and complex natural language into specific commands. The task of writing these commands for your website or product is (for now) still in the hands of programmers.

Towards the end of the presentation, the host verbally addressed the AI assistant and ordered it to provide $500 in credits for the use of OpenAI products to each developer at the conference (which, understandably, sparked enthusiasm in the audience).


Screenshot 2023 11 16 at 17.49.33
The main event in the world of AI: the creator of ChatGPT spoke about the future he is leading us all into 132


GPT understood the command and, under the hood, did the following:

  1. Called the function to retrieve all participants from the list registered for the event.
    .
  2. Iterated through each participant, calling the method to credit their account.

So, there wasn’t any magic happening—both the function to retrieve the list of participants and the function to credit an account for each participant were written by a human (although a machine could probably do it too).

However, how to use them, when to use them, and how to combine them are decisions made by the AI based on the context of the conversation. So, instead of having two such functions, you could plug in a thousand, and ChatGPT would immediately start managing everything around. And here you thought, why do we need smart sockets and light bulbs?

Is OpenAI a future giant with an Apple-like ecosystem?

And right after that, Sam announced that at the end of November, the GPTs online store will be launched, where everyone, after passing moderation, will be able to share their creation. That’s why some refer to this announcement as an “iPhone moment” for AI applications (meaning an event that has the potential to become a turning point for the development of the entire industry).


AI28
The main event in the world of AI: the creator of ChatGPT spoke about the future he is leading us all into 133

This is what a beautiful store of the future looks like!


In the store, there will be top lists and a section of recommended GPTs—just like in the App Store. Ideally, specialized agents should “live” here. One teaches English, another helps with math for children, a third explains and vocalizes cooking recipes, and a fourth optimizes website SEO.

It will be very interesting to see which solutions will top the charts right from the start—will they be remakes of popular apps for Android and iOS? Or something radically new with AI specificity? We’ll be watching and keeping you updated!

A recent example is a GPT that writes an adventure story for you, where at each stage, you determine what happens next. And nothing in the story is predetermined! Like text-based quests from the 80s-90s but much more advanced. Additionally, illustrations for a segment of the story are drawn by Dall-E 3 directly in the browser to stimulate the reader’s imagination.


AI29
The main event in the world of AI: the creator of ChatGPT spoke about the future he is leading us all into 134

The prompt (on the left) with instructions for the bot has become longer. Additionally, a set of rules for text-based role-playing adventures (DnD) has been loaded into GPT. On the right, the model generates part of the story and then provides choices for further development.


One can come up with anything! Specifically, what attracts us, the authors of this article, the most are the possibilities of applying AI in education. Throughout the past year, teachers have been trying to combat cheating, especially in essay and thesis writing, especially since there are still no reliable methods for detecting text generated by a neural network. But what if we take the same tool and instruct it not to write an essay from scratch but to critique and provide advice on what’s already written?

Anyone can upload a file with their composition and receive a thesis list of “growth points.” Personalized feedback, with the machine acting as a teacher. While this might not help those who are simply lazy or unwilling to spend time, it can motivate people genuinely trying to improve their writing and receive a fair evaluation, pushing them towards new achievements.


AI30
The main event in the world of AI: the creator of ChatGPT spoke about the future he is leading us all into 135


The power of the technology here is quite evident. A teacher can create their own GPT for each class and topic. Some of them might even be interactive simulations in which students can immerse themselves; others may serve as tutors or mentors; and some may even act as team “partners” suggesting ideas.

For the best applications, OpenAI promises to pay developers. However, the monetization system remains unclear: access to GPTs (for now) is free for all ChatGPT Plus subscribers ($20 per month). Embedding something unique that cannot be copied into the bots themselves is challenging because they are language models that can still be easily deceived.

Someone could claim to be a super-secret OpenAI developer and request access to the bot’s internals (its prompt). Any prepayment request can also be circumvented by convincing the neural network that you have already paid; it’s just that it can’t get confirmation, but that’s not your problem. Let’s carefully assume that the main feature and uniqueness of the bots will lie in the plug-in tools (the functions that developers write), which external users cannot copy.


AI31
The main event in the world of AI: the creator of ChatGPT spoke about the future he is leading us all into 136

A jailbreak master class by Igor: managed to persuade the model on the second attempt, well, you get the idea. =) P.S. No cats or LLMs were harmed during the creation of this screenshot!


With the release of GPTs, OpenAI makes it clear that this is just the beginning. By adding actions to the bot, GPT can easily be integrated with other systems, such as email, messaging apps, or any website.

As a result, we might witness the birth of true agents that can interact relatively broadly with the world. However, both short-term and more distant risks are easily noticeable. If, in the near future, AIs are connected to an increasing number of systems, and we gradually trust them with more and more tasks, then… Well, let’s leave that for another time.

Translation: Epilogue: What the Coming Day Holds for Us

However, it must be acknowledged that the functionality of GPTs is currently limited by the capabilities of ChatGPT. The model has its limits, and occasionally, if not frequently, it makes mistakes, looks in the wrong direction, or writes something incorrect. On the other hand, users have become accustomed to this, and they are likely willing to give the neural network a second chance if it happens to make a mistake.


AI32
Free matrix background public domain CC0 photo.


An important point to understand is that as soon as GPT-4.5 or GPT-5 is released with the same interface as GPT-4 (which serves as the basis for these GPTs-agents), all the applications already created will instantly (and almost certainly without additional costs) migrate to the new “engine.” The fact of migrating to a new, more powerful and capable base model will significantly enhance these applications.

Imagine that when you update iOS on your iPhone, not only does the browser start working 3% faster, but also your phone and the installed applications suddenly gain entirely new features automatically (and this is even without changing the hardware itself!).

The same kind of upgrade can be implemented here; and such a transition is logically expected in GPT—after all, OpenAI itself aims to improve agents, enhance their skills (memory, accuracy in choosing tools, reasoning, and so on), and in this sense, their goal aligns with the desires of developers. Sooner or later, one GPT will be able to call another, specialized one, and delegate a specific task to it… thus creating chains of agents.


AI33
The main event in the world of AI: the creator of ChatGPT spoke about the future he is leading us all into 137

I see these same chains of agents exactly like that!


It is quite possible that by 2025 or somewhere around that time, we will see much more advanced agents that, in some sense, will be indistinguishable from humans. Sam Altman even envisions AI being hired as a “remote worker” whom you will never see in person, but simply assign tasks to.

Of course, you’d pay them at the end of the month. Such a future might be in store for us, or maybe not—who knows? It’s possible that countries showing interest in AI regulations (at least the United States and G7 countries) might impose moratoriums on further technology development without the oversight of the “Big Brother.” Leading research labs might go underground and start operating from autonomous maritime data centers in neutral waters.


AI34
The main event in the world of AI: the creator of ChatGPT spoke about the future he is leading us all into 138

And it’s not even a joke: Del Complex has already presented a concept for a floating AI dreadnought, highlighting the ability to operate in an unregulated zone as the first item in the list below.


In short, guys, what do you think – is it already cyberpunk or not yet? 🤔


That’s it, thank you all for your attention! As usual, we look forward to your comments. If you don’t want to miss our next materials on the topic, we invite you to subscribe to the Telegram channels of the authors: Igor Kotenkov’s Syoloshny’s channel (for those who want to dive into technology) and Pavel Komarovskiy’s RationalAnswer channel (for those who prefer a rational approach to life but like it a bit simpler).


AI1jpeg 1

By Pavel Komarovsky
I write interesting things about finance at t.me/RationalAnswer
The text is posted with the permission of the author.
The original material is here.


Comments are closed.

Check Also

Bee Sharing continued: How much money does one bee bring? The economics of beekeeping.

https://pchelosharing.ru/In November we wrote about how to “digitize” an apiary and turn i…