Ever heard of this AI that will transform every
aspect of our lives, make human intelligence obsolete and accelerate the downfall of normal
brains back to the Stone Age? Yes, you guessed it! Today we're talking about Chat GPT and more
specifically Chat GPT in the context of language learning. I know you have already seen countless videos
on this very topic. So let's try to bring it to the next level, debunk some of the myths generally
associated with Chat GPT, understand why it tells you so
much c*** and understand its strengths
and weaknesses so that we can use it properly for language learning. Of course, if understanding
how GPT works, what it can and cannot do does not interest you, please use the timestamps down below
to go to the part of the video that interests you the most. If Chat GPT is to become our next best
language teacher, it is important to understand how it works and also figure out whether or not
it understands what we are and what it is saying. Chat GPT is
a chatbot that was developed by OpenAI
and launched at the end of 2022 for everybody to use. As of June 2023, Chat GPT is not connected
to the internet. Chatbots are nothing new. You find them pretty much on any commercial website
nowadays. They usually take the form of a small tab that pops up on screen to ask you how they can
help you. These chatbots usually work according to keywords and depending on whether or not you find
the right keyword to formulate your question, they may or may not be very helpful to you. Chat GPT on
the other hand is a chatbot that is based on a more advanced technology called "machine learning".
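To picture how the keyword-based chatbots just described behave, here is a minimal sketch in Python. It is only an illustration: the keywords, the replies and the function name `keyword_reply` are all made up, and real website chatbots are likely more elaborate.

```python
# Hypothetical keyword table: keywords and replies are invented for illustration.
KEYWORD_REPLIES = {
    "refund": "You can request a refund from your order page.",
    "delivery": "Standard delivery takes 3 to 5 working days.",
    "password": "Use the 'Forgot password' link on the login page.",
}

FALLBACK = "Sorry, I did not understand. Could you rephrase your question?"


def keyword_reply(message: str) -> str:
    """Return the canned reply for the first known keyword found in the message."""
    # Lowercase the message and strip basic punctuation before splitting into words.
    cleaned = message.lower()
    for mark in "?.,!":
        cleaned = cleaned.replace(mark, " ")
    words = cleaned.split()
    for keyword, reply in KEYWORD_REPLIES.items():
        if keyword in words:
            return reply
    # No keyword matched: the bot has nothing useful to say.
    return FALLBACK


if __name__ == "__main__":
    print(keyword_reply("When will my delivery arrive?"))  # contains "delivery"
    print(keyword_reply("My parcel is late"))              # same need, no keyword
```

The second question is about the same problem as the first, but because it does not contain the word "delivery", the bot falls back to its default reply. That is the weakness of keyword matching.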
The most widespread machine learning model is called "supervised learning" and it is usually
used for spam detection or image recognition. You can train a machine to recognize what is
on an image: you create a model with a series of features, you use an image as input and the
model will provide a specific word as output. At the start, the model will tell you a lot of c***
until it is trained enough to output the right word. The model's features will be tweaked along the
way until it reaches the highest probability of providing the correct answers. This model, although
very interesting, requires an extremely long training period. It also has a major disadvantage
as it requires a lot of human input. However, this is not the way Chat GPT works. Chat GPT was designed
to take the form of a conversation, so this machine learning model was not appropriate. Rather,
Chat GPT is based on a baseline model i.e., a model that was first trained to perform generic tasks
before being tweaked to perform more specific ones. In the case of Chat GPT, the baseline model
is none other than GPT, which stands for Generative Pre-trained Transformer. Interestingly enough,
this baseline model was trained to perform a most unusual task, that is guessing the upcoming
word in a text. So the input would be any word, any part of a sentence or even any text, while the
expected output would be any word that would be a plausible completion of said input. An example
would be "the mouse is eaten by ..." with "the cat" as output. You may wonder why anybody would want to
train a machine to guess the upcoming words in a text as it doesn't really serve much purpose.
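Before answering that, here is a rough Python sketch of what "guessing the upcoming word" can look like. It is not how GPT does it: GPT relies on a neural network, whereas this toy version simply counts which word follows each pair of words. The corpus, the function names `train` and `most_plausible_next` and the two-word context are all assumptions made for the example.

```python
from collections import Counter, defaultdict

# Tiny made-up corpus; a real model is trained on vastly more text.
CORPUS = (
    "the mouse is eaten by the cat . "
    "the bird is eaten by the cat . "
    "the cheese is eaten by the mouse ."
)


def train(text, context_size=2):
    """Count which word follows each run of `context_size` words in the text."""
    words = text.split()
    followers = defaultdict(Counter)
    for i in range(len(words) - context_size):
        context = tuple(words[i:i + context_size])
        followers[context][words[i + context_size]] += 1
    return followers


def most_plausible_next(followers, prompt, context_size=2):
    """Return the most frequent continuation of the prompt's last words, or None."""
    context = tuple(prompt.split()[-context_size:])
    if context not in followers:
        return None  # this context never appeared in the corpus
    return followers[context].most_common(1)[0][0]


if __name__ == "__main__":
    model = train(CORPUS)
    print(most_plausible_next(model, "the mouse is eaten by"))      # the
    print(most_plausible_next(model, "the mouse is eaten by the"))  # cat
```

Note that the toy model answers "cat" only because that word follows "by the" most often in its corpus, not because it knows anything about mice or cats.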
Well, the great thing with this strategy is that it doesn't require a lot of human supervision
at all, as you can simply train the model using an existing corpus of data, i.e., any text, any book
or encyclopedia that's ever existed in digital form. For instance, GPT was fed the entire English
version of Wikipedia. Not sure it's a guarantee of quality but that's another matter... As with other
machine learning models, the features of the model were tweaked, adjusted along the way to ensure it
gives the most plausible answer, that is, the word that is most likely to appear as a completion
of a specific sentence, phrase or text given as input. GPT-3 is equipped with no fewer than 175 billion
parameters. It is important to understand that GPT was not trained to provide true answers. The
model was trained to provide a "plausible output". For instance, if you ask GPT when Christopher
Columbus discovered America, it will tell you 1492. Not because it knows it as a historical
fact, but because most, if not all, of the texts it was trained on associate "Christopher", "Columbus",
"discovered" and "America" with this very date. However, the proof that it doesn't understand what you are
asking is that if you swap "Christopher Columbus" for another name, it will give you the exact same
answer. I was not able to access GPT myself, as you apparently need API access to be able to use
it and I just couldn't be bothered. However, if you would like to have more examples, I would
encourage you to check out the video on this very topic that was made by David Louapre from the
French channel "Science étonnante". It's a French channel so obviously it's going to be in French ^.^. But there's always
the possibility to use automatic subtitles if you do not understand French. The information I
have provided is a quick summary of this video. So if you want more details, please check it
out. Besides the fact that it was not trained to produce correct answers, it also does not simply
reproduce sentences that it has learned. Rather, it endeavors to produce the most plausible output, as
in anything that would not clash with existing texts. This model presents an obvious advantage:
you can repeat the task endlessly so as to produce an entire text based on a single starting word. When it comes to languages, GPT was trained on texts in different languages. However, about half of
its training was done in English, which causes a major problem as we will see later in this video.
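The idea of repeating the guessing task to produce an entire text from a single starting word can be sketched in a few lines of Python. This is a toy illustration only: the table `NEXT_WORD` and its probabilities are invented, and the function name `generate` is hypothetical. A real model computes these probabilities with a neural network instead of reading them from a fixed table.

```python
import random

# Hypothetical next-word table with made-up probabilities:
# word -> {possible next word: probability}
NEXT_WORD = {
    "the": {"cat": 0.5, "mouse": 0.3, "cheese": 0.2},
    "cat": {"eats": 0.7, "sleeps": 0.3},
    "mouse": {"eats": 0.6, "sleeps": 0.4},
    "eats": {"the": 1.0},
    "cheese": {".": 1.0},
    "sleeps": {".": 1.0},
}


def generate(start, max_words=12, seed=None):
    """Build a text by repeatedly picking a plausible next word."""
    rng = random.Random(seed)
    words = [start]
    while len(words) < max_words:
        options = NEXT_WORD.get(words[-1])
        if not options:
            break  # no known continuation (for instance after ".")
        choices, weights = zip(*options.items())
        # Pick the next word at random, weighted by its probability.
        words.append(rng.choices(choices, weights=weights, k=1)[0])
    return " ".join(words)


if __name__ == "__main__":
    print(generate("the", seed=1))
    print(generate("the", seed=2))
```

Each run strings together words that "go well together" according to the table, and different seeds may give different texts. Nothing in the loop checks whether the resulting sentence is true.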
Contrary to what people tend to assume, GPT does not have any memory of any conversation you may have with it. Neither does it know anything beyond 2021. It is a frozen model. Its knowledge stops at a
certain date. Of course, this may change when it gets connected to the internet, but for now
that's the way it is. Now that we understand the baseline model on which Chat GPT is based, let's
learn a little bit more about Chat GPT itself. As I said, GPT was not trained to be a chatbot.
It was trained to complete sentences or texts. Chat GPT uses a pre-established prompt, which
is an introductory phrase or paragraph that will help the machine to understand what you
are expecting from it. For instance, "This is a conversation between a human and a well-learned
chatbot". We do not know exactly what pre-prompt is used in the case of Chat GPT, but we know that it is
most likely a long list of instructions. Since GPT is trained to complete sentences or texts
and not to have conversations, it was "fine-tuned" using a database of real human answers. The third
and last training phase is called "reinforcement learning", and during this phase humans judged the
quality of the answers that were provided by GPT. With this short summary, we now understand that
Chat GPT does not think, it does not ponder, it does not reflect. Neither does it have any memory
of any kind of conversation you may have had with it. And it is not connected to the internet. It
generates plausible answers in which words go well together based on its database. Now that
we know this, let's try to find out if GPT is an appropriate language teacher... Since Chat GPT was trained to be a chatbot, it can potentially break down complex grammar points and adapt them
to the normal flow of a conversation, which would be easier to understand and more user-friendly. In my
previous video on how to start reading in Arabic or in any other language, I broke down some of the
most important elements of the Arabic grammar, one of which was none other than the "Masdar form". So
I asked Chat GPT to teach me the Masdar form of the first 4 verb forms. "In Arabic grammar, the
Masdar form is a verbal noun that represents the action
or the concept of a verb. It is derived from
the root letters of the verb. Here are the Masdar forms of the first 4 verb forms along with
simple examples and their English translations". The 1st verb form apparently is "fa'la", which is
a form I did not know existed... It tells us that the Masdar form is "fi'lun". Well, I suppose that if the 1st verb form was "fa'ala", "fi'lun" would indeed be a possibility. It is for instance the Masdar form of the verb "'alama" (to know): "'ilmun" (knowledge).
Interestingly enough, for the 1st verb form, there are different Masdar forms, which it
does not seem to explain here. The 1st verb form explanation is consequently quite problematic,
especially if we consider the example قرأت الكتاب. القراءة ممتعة. The example does not match at all
the Masdar form that was provided. Let's move on to the 2nd form. It tells us
that the 2nd form is "fa'aala", which is correct and then that the Masdar form is "taf'iil", which
is correct too.
However, the example is صممت الخطة. التصميم جيد. "Tasmiim" is indeed the Masdar form of verb "sammama". Unfortunately, "el jatat" is not properly vowelised. It should have been "eljuttat". Let's see what it tells us for
the 3rd form. The 3rd form according to Chat GPT is "faa'ala", which is correct. But it tells us
that the Masdar form is "fa'aala", which is not correct at all... Funnily enough, in the example سجدت في صلاة. السجود واجب. the vowelisation is correct... Unfortunately, none of
the words that appear in this example belongs to the 3rd form! Finally, it tells us that "af'ala" is the
4th form, which is correct, and that the Masdar form is "if'al", which is correct too. And
then it provides the following example سعدت بالنجاح. السعادة مهمة. Unfortunately, again the vowelisation
is incorrect. It should have been سَعَدتّ بالنجاح. السَعادة مهمة. And the 2nd problem is obviously that this
example does not include any word that could potentially belong to the 4th verb form. I was
quite confused with this 1st conversation so I asked Chat GPT to provide more explanations when it
comes to the example sentence it provided for the 3rd verb form. I asked, "You are an Arabic expert.
Tell me if the 3rd form of Arabic appears in this sentence." It responded, "Yes, the 3rd
verb form "fa'aala" (which is definitely not the 3rd verb form) of Arabic does appear in the sentence
you provided. The verb form you are referring to is "sejeda" in the past tense, which means to
"prostrate". In the sentence, it is conjugated as "sajadto"." The content of this answer makes me
understand that Chat GPT does not understand anything it is saying about Arabic grammar.
Since it was telling me a lot of c***, I tried to steer the conversation in the right direction by
asking, "Are you sure the verbal noun of the 3rd verb form is fa'aala?" "You are correct and I apologize for the mistake. The Masdar of the 3rd verb form "fi'aala" in Arabic is not "fi'aala" but rather "istif'aal""!!
I stopped the conversation here simply because it went on like that for 2 or 3
more answers, all of which were completely nonsensical, at least for anybody who knows
Arabic grammar. This tends to indicate that it will go on like that until you are satisfied with
the answer it provides. Obviously, it's impossible to check all the aspects of Arabic grammar
in one single video, but if it struggles with something as simple and straightforward
as the Masdar, it's unlikely that it would be much more competent for other, more complex
aspects of the Arabic language. But I might be wrong. Chat GPT can also be used to practice
reading especially at the beginning. Reading Arabic without vowels can be quite challenging
especially as a beginner. So you can ask Chat GPT to write short stories for you while adding
vowelization. So I asked Chat GPT to provide a short story in Arabic about Japan and to make
sure that every word was vowelised. The text reads: كَانَ يُوسُفُ وَاحِدًا مِنْ الشَبَابِ الَّذِينَ كَانُوا يُحِبُّونَ الثَّقَافَةَ الْيَابَانِيَّةِ. لَقَدْ قَرَأَ الْعَدِيدَ مِنْ الْكُتُبِ وَشَاهَدَ الْعَدِيدَ مِنَ الْأَفْلَامِ عَنْ هَذَا الْبَلَدِ الْبَعِيدِ. طَمَوْحُ يُوسُفِ كَانَ أَنْ يَزُوْرَ الْيَابَانَ يَوْمًا مَا وَأَنْ يَسْتَكْشِفَ جَمَالَهَا الْفَرِيدَ قَرَّرَ يُوسُفُ أَنْ يُقَوِّمَ بِرِحْلَةٍ إِلَى الْيَابَانِ. حَضَرَ جَوَازَ سَفَرِهِ وَأَمْوَالَهُ وَبَدَأَ الْبَحْثَ عَنْ تَذَاكِرِ الطَّيَارَانِ. We're about halfway through and there are
already quite a few mistakes, so I definitely encourage you to be cautious here too. The fact
that there are so many problems when it comes to vowels in Arabic might be a little bit strange. However, if we go back to the 1st part of the video and remember that GPT was trained using a
corpus of existing texts, we can easily understand why there are so many problems as most existing
texts in Arabic do not include vowels. Of course, you could argue that many short stories or even
novels have a lot of similar
typos. However, the problem here is that it's supposed to teach me a
grammar point that it does not seem to understand and also that it tells you a lot of c*** in a
very confident way, which in my opinion is quite problematic. Even though Chat GPT is not perfect
when it comes to Arabic grammar and vowelisation, it does do quite a good job at explaining the
meaning of words. For instance, here, in Japanese, I asked the meaning of the expression "umaga au" and it provided a most adequate answer. It's also quite good at explaining slang terms in French. So
even though it's not the best grammar teacher, it is a good vocabulary assistant. Now
Chat GPT of course can be used for many other purposes when it comes to language learning. One
of which is actually listening and speaking. You can do so by adding a Chrome extension, either
"Talk to Chat GPT" or "Voice control for Chat GPT". Both extensions come with a lot of different
features. You can for instance adjust the "read out loud speed" to make it faster or slower. You
also have the possibility to choose your voice preference, that is, whether you prefer the voice of a man or
a woman. However, most voices (if not all) sound quite artificial... Depending on your needs, you can also
make the input and output languages different. Both extensions allow for oral conversations. However,
I do not use these extensions for this purpose, mostly because I cannot bear the sound of these
robot-sounding voices. I personally prefer to use these extensions to produce texts on topics
that interest me, which can then be read out loud for pronunciation purposes or to
train listening comprehension skills. Of course, you can generate texts based on your current
level, for instance by asking Chat GPT to only use vocabulary known to primary school students,
junior high or senior high students. With both extensions, you can play with speed adjustments,
depending on the kind of training that you wish to have. Of course, you can also import your own
texts or articles, short stories, parts of novels that you would like to read or have read out loud
for listening comprehension training. You simply need to copy and paste the text into Chat GPT
and then activate the sound. For further training, you can ask Chat GPT to ask you questions
about the content of the text you have just read or listened to and respond in your target
language. Unfortunately, these extensions do not yet include Arabic. So listening and speaking
skills cannot yet be trained in Arabic on Chat GPT using these extensions. In my opinion, these do
not replace at all the many benefits of having a language partner, mostly because conversations are
also about facial expressions, tone and filler words, which you will not get with this chatbot. That's
all for today's video. It ended up being quite long. At least, I hope that you got to understand
how Chat GPT works if you did not know already and also understand that it still has many flaws and
that it's not necessarily the go-to resource if you want to master a language. Of course, this may
change the day it gets connected to the internet but for now, it's not the case. Thank you for
watching and I will see you in the next one ^.^