Before Going to Tokyo, I Tried Learning Japanese With ChatGPT

On the final day of my visit to Japan, I’m alone and floating in some skyscraper’s rooftop hot springs, praying no one joins me. For the last few months, I’ve been using ChatGPT’s Advanced Voice Mode as an AI language tutor, part of a test to judge generative AI’s potential as both a learning tool and a travel companion. The excessive talking to both strangers and a chatbot on my phone was illuminating as well as exhausting. I’m ready to shut my yapper for a minute and enjoy the silence.

When OpenAI launched ChatGPT late in 2022, it set off a firestorm of generative AI competition and public interest. Over two years later, many people are still unsure whether it can be useful in their daily lives outside of work.

A May 2024 video from OpenAI stuck in my memory. It showed two researchers chatting back and forth, one in English and the other in Spanish, with ChatGPT acting as a low-latency interpreter. I wondered how practical the Advanced Voice Mode could be for learning to speak bits of a new language and whether it’s a worthwhile app for travelers.

To better understand how AI voice tools might transform the future of language learning, I spent a month practicing Japanese with the ChatGPT smartphone app before traveling to Tokyo for the first time. Outside of watching some anime, I had zero working knowledge of the language. During conversation sessions with the Advanced Voice Mode that usually lasted around 30 minutes, I often approached it as my synthetic, over-the-phone language tutor, practicing basic travel phrases for navigating transportation, restaurants, and retail shops.

On a previous trip, I’d used Duolingo, a smartphone app with language-learning quizzes and games, to brush up on my Spanish. I was curious how ChatGPT would compare. I often test new AI tools to understand their benefits and limitations, and I was eager to see if this approach to language learning could be the killer feature that makes these tools more appealing to more people.

Me and My AI Language Tutor

Jackie Shannon, an OpenAI product lead for multimodal AI and ChatGPT, claims to use the chatbot to practice Spanish vocabulary words as she’s driving to the office. She suggests beginners like me start by using it to learn phrases first—more knowledgeable learners can immediately try free-flowing dialogs with the AI tool. “I think they should dive straight into conversation,” she says. “Like, ‘Help me have a conversation about the news on X.’ Or, ‘Help me practice ordering dinner.’”

So I worked on useful travel phrases with ChatGPT and acted out role-playing scenarios, like pretending to order food and make small talk at an izakaya restaurant. Nothing really stuck during the first two weeks, and I began to get nervous. But around week three I started to gain a loose grip on a few key Japanese phrases for travelers, and I felt noticeably less anxious about the impending interactions in another language.

ChatGPT is not necessarily designed with language acquisition in mind. “This is a tool that has a number of different use cases, and it hasn’t been optimized for language learning or translation yet,” says Shannon. The generalized nature of the chatbot’s default settings can make the first sessions frustratingly bland, but after a few interactions ChatGPT’s memory feature caught on fairly quickly that I was planning a Japan trip and wanted speaking practice.

The “memory” instructions for ChatGPT are passively updated by the software during conversations, and they impact how the AI talks to you. Go into the account settings to adjust or delete any of this information. An active way you can adjust the tool to be better suited for learning languages is to open the “custom instructions” options and lay out your goals for the learning experience.

What frustrated me most was the incessant, unspecific guideline violation alerts during voice interactions, which ruined the flow of the conversation. ChatGPT would trigger a warning when I asked it to repeat a phrase multiple times, for example. (Extreme repetition is sometimes a method used by people hoping to break a generative AI tool’s guardrails.) Shannon says OpenAI rolled out improvements related to what triggers a violation for Advanced Voice Mode and is looking to find a balance that prioritizes safety.

Also, be warned that Advanced Voice Mode can be a bit of a yes-man. If you don’t request it to role-play as a tough-ass tutor, you may find the personality to be saccharine and annoying—I did. A handful of times ChatGPT congratulated me for doing a fabulous job after I definitely butchered a Japanese pronunciation. When I asked it to provide more detailed feedback to really teach me the language, the tool still wasn’t perfect, but it was able to respond in a manner that fit my learning style better.

Comparing the overall experience to my past time with Duolingo, OpenAI’s chatbot was more elastic, with a wider range of learning possibilities, whereas Duolingo’s games were more habit forming and structured. Are ChatGPT’s language abilities an existential threat to Duolingo? Not according to Klinton Bicknell, Duolingo’s head of AI. “If you’re motivated right now, you can go to ChatGPT and get it to teach you something, including a language,” he says. “Duolingo’s success is providing a fun experience that’s engaging and rewarding.”

The company partnered with OpenAI in the past and is currently using its AI models to power a feature where users can have conversations with an animated character to practice speaking skills.

Putting ChatGPT to the Test in Tokyo

ChatGPT really became useful when I wanted to practice a phrase or two before saying it while out and about in Tokyo. Over and over, I whispered into my smartphone on the sidewalk, requesting reminders of how to ask for food recommendations or confess that I don’t understand Japanese very well.

Using Advanced Voice Mode to translate back and forth live may be great for longer conversations you’d want to have in more intimate settings, but at a buzzy restaurant, crowded shrine, or other common tourist spots in Japan, it’s just easier to do asynchronous translations with the tool.

At a barbecue spot with an all-you-can-drink special and a mini-keg of lemon sour right under the table, the food came out but not the drinking mugs we had requested. I had a tough time asking for them. The waitress was patient with us as I spoke a few lines into ChatGPT and showed her the translation on my smartphone. She then explained I hadn’t yet signed a waiver promising not to drink and drive and brought out a form to sign. A few minutes later, she returned with the mugs. In this instance, OpenAI’s chatbot was quite helpful, but I likely would have been just fine using the Google Translate app.

More times than I would like to admit, though, the phrases I thought I had down pat by practicing with ChatGPT ended up sloshing around in my head and embarrassing me. For example, while trying to get back to the hotel around 10 pm via the train, I got disoriented looking for the correct station exit. I was able to ask for help from one of the station staff members, but instead of saying “thank you” (arigato gozaimasu) at the end, my tired mind blurted out the phrase for “this one, please” (kore wo onegaishimasu) as I confidently strode away.

After a month of ChatGPT practice, did I really know Japanese? Of course not. But a few of the polite greetings and touristy phrases stuck well enough, most of the time at least, to navigate my way around Tokyo and feel like I could really enjoy the thrill of adventure in a new country.

As generative AI tools improve, they will keep getting better at helping language learners practice speaking skills, as well as their reading skills. Tomotaro Akizawa, an associate professor and program coordinator at Stanford’s Inter-University Center for Japanese Language Studies in Yokohama, gives me an example. “Students who have just completed the beginner level can now try to read challenging literary works from the Shōwa era by using AI for translations, explanations, and word lists,” he says.

If students eventually end up relying only on generative AI tools and go through their entire language-learning journey without a human instructor, then the complexities of spoken language and communication may get flattened over time.

“The opportunity to personally experience the human elements embedded in the target language—such as emotions, thoughts, hesitations, or struggles—would be lost,” says associate professor Akizawa. “Words spoken in conversation are not always as structured as those from a large language model.” AI may be more patient with you than a human tutor, but language learners risk losing the rough edges and experience-based insights.

Have you tried to learn to do anything with AI? Would you feel confident using AI to help with translation in public? Let us know your experiences by emailing hello@wired.com or commenting below.