On the Performance of the Japanese-Built AI Model, Tsuzumi

This article is an English translation of an article written by the author in Japanese at pickerlab.net.

The rapid pace at which large language models (LLMs) and generative AI tools such as ChatGPT, Gemini, and Claude are being released has created what can only be described as a competitive arms race in AI development. As a Japanese professional working in data science and natural language processing (NLP), I often find myself discussing these advancements with clients during casual conversations. Inevitably, the question arises: “How do domestically developed Japanese LLMs compare?”

To put it bluntly, my perspective—and one that I believe is shared by many others deeply involved in AI and NLP—is that it is virtually impossible for Japanese-developed models to compete on a global stage. At least, that’s my honest assessment.

To provide some context, back in the era when BERT was the dominant language model, international models often struggled with Japanese language comprehension. This left room for Japanese developers to create language models specifically tailored for Japanese, which had practical value within the domestic market. However, with the release of models like GPT-3.5, the situation changed dramatically. These international models demonstrated an astonishingly high level of Japanese language understanding, leading many experts, myself included, to feel that the role of Japanese developers in LLM development has significantly diminished.

The underlying reason for this shift lies in the way LLMs process language. These models convert natural language—whether it’s Japanese, English, or any other language—into numerical vector representations. Once the text has been accurately encoded as vectors, the processing that follows is largely language-agnostic. In other words, the distinction between languages becomes irrelevant at the computational level.
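To make this concrete, here is a minimal sketch of the idea, not how any real LLM is implemented. The vocabulary, the three-dimensional vectors, and the function names are all made up for illustration. The point is only that once Japanese and English tokens have been turned into vectors, the same numeric code processes both.

```python
import math

# Toy vocabulary: every entry and every number below is invented for illustration.
VOCAB = {"猫": 0, "cat": 1, "犬": 2, "dog": 3}

# Toy embedding table: one 3-dimensional vector per token ID.
EMBEDDINGS = [
    [0.9, 0.1, 0.0],  # 猫
    [0.8, 0.2, 0.1],  # cat
    [0.1, 0.9, 0.0],  # 犬
    [0.2, 0.8, 0.1],  # dog
]

def embed(token):
    """Look up the vector for a token, whatever language it comes from."""
    return EMBEDDINGS[VOCAB[token]]

def cosine_similarity(a, b):
    """Language-agnostic step: this function only ever sees lists of numbers."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

sim_same_meaning = cosine_similarity(embed("猫"), embed("cat"))
sim_different_meaning = cosine_similarity(embed("猫"), embed("dog"))
print(sim_same_meaning > sim_different_meaning)
```

In this toy table the Japanese and English words for the same animal were deliberately given nearby vectors, so the comparison comes out in their favor. In a real model, such vectors are learned from data rather than written by hand.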

As a Japanese professional observing these developments, it’s clear to me that the technological gap between domestic and international players in AI has widened.

The Reality of Domestic LLMs in Japan

Developing large language models (LLMs) requires an immense investment, with costs reportedly ranging from tens to hundreds of billions of yen, and daily operational costs nearing 100 million yen. Frankly, no Japanese IT company has the financial capacity to sustain this level of investment. Even the largest IT firms in Japan would quickly fall into the red if they attempted it (and it’s worth noting that even OpenAI operates at a staggering deficit).

As a result, the general consensus among professionals in the AI and data science industries is to avoid engaging with domestic LLMs and not to place any expectations on them.

That said, for users who have recently taken an interest in AI due to the rise of generative AI, it might seem natural to wonder if Japanese-made AI would be better suited for use in Japan. Personally, however, I find it troublesome when such expectations are placed on me. When asked, “What about domestic LLMs?” during work discussions, I usually respond immediately with, “There’s no need to consider them.”

At the same time, I’ve started to feel that dismissing domestic LLMs outright without even trying them is somewhat unprofessional. As someone who is paid for their expertise, it’s not entirely fair to judge without direct experience.

With that in mind, I decided to conduct a very simple test of “Tsuzumi,” a domestically developed LLM by NTT Data, which is accessible through Azure OpenAI Service.

Testing Tsuzumi 7B, GPT-3.5 Turbo, GPT-4.0, and GPT-4o with 13 Questions

I conducted a simple accuracy test using 13 questions as a benchmark. I selected them from various sources, including the Japanese university entrance exam (Center Test), an employment aptitude test (SPI), and general knowledge questions (economics and law).

Using a scoring system where a correct answer earns 1 point, a partially correct answer earns 0.5 points, and an incorrect answer earns 0 points, I calculated the percentage of correct answers for each model. A perfect score of 13/13 would correspond to 100%.
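As a rough sketch of this scoring rule, the calculation could look like the following. The function name and the example grading sheet are hypothetical and were written only for this illustration. They are not the actual grading data.

```python
def accuracy_percent(points_per_question):
    """Return the total score as a percentage of the maximum possible points."""
    allowed = {0, 0.5, 1}
    if any(p not in allowed for p in points_per_question):
        raise ValueError("each question must be graded 0, 0.5, or 1")
    return 100 * sum(points_per_question) / len(points_per_question)

# Hypothetical grading sheet: one partially correct answer and twelve misses.
example_sheet = [0.5] + [0] * 12
print(round(accuracy_percent(example_sheet)))  # 0.5 / 13 is about 4%
```

With 13 questions, each full point is worth roughly 7.7 percentage points, so small differences between models should be read with caution.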

Question Selection Criteria

The questions were chosen entirely at my discretion and designed to be challenging, particularly for LLMs. They included:

  • 4 general knowledge questions (economics and law)
  • 3 math questions from Japan’s university entrance exams (Center Test/University Common Test)
  • 6 reading comprehension questions from employment aptitude tests (SPI)

The difficulty level was intentionally set so that GPT-3.5 Turbo would struggle, while GPT-4.0 might have a chance of achieving full marks.

Example Question

To give an idea of the type of problems used, here’s an example math question:

A theater group’s total number of members decreased by 40% from last year, leaving 480 members this year. By gender, the number of women decreased by 25%, while the number of men decreased by 62.5%. Calculate the number of women in the theater group this year. (Round to the nearest whole number if necessary.)

The correct answer is 360 women.
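For readers who want to check the arithmetic, here is one way to work it out, sketched in Python. The variable names are my own. The approach is simply to recover last year's total from the 40% decrease and then solve the single linear equation for last year's number of women.

```python
from fractions import Fraction

total_now = 480
total_drop = Fraction(40, 100)    # overall decrease
women_drop = Fraction(25, 100)    # decrease among women
men_drop = Fraction(625, 1000)    # decrease among men (62.5%)

# This year's 480 members are 60% of last year's total.
total_last = total_now / (1 - total_drop)

women_keep = 1 - women_drop       # share of women remaining
men_keep = 1 - men_drop           # share of men remaining

# Solve women_keep * w + men_keep * (total_last - w) = total_now for w,
# where w is last year's number of women.
women_last = (total_now - men_keep * total_last) / (women_keep - men_keep)
men_last = total_last - women_last

women_now = women_keep * women_last
men_now = men_keep * men_last

print(women_now, men_now)
```

Exact fractions are used instead of floating-point numbers so that 62.5% does not introduce rounding noise. The result is 360 women and 120 men, which add up to this year's 480 members.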

This kind of calculation question is representative of the math problems included in the test. Results for each model will be discussed in the next section.

As for the reading comprehension questions, they included typical problems like selecting the correct conjunction to fill in a blank or choosing a sentence that does not contradict the target passage. These are common in the Japanese Center Test and similar exams.

Tsuzumi performed worse than GPT-3.5 Turbo.

The results showed GPT-4o achieving a 77% accuracy rate, GPT-4.0 at 53%, GPT-3.5 Turbo at 12%, and Tsuzumi at just 4%. This means Tsuzumi’s performance was below GPT-3.5 Turbo.

Tsuzumi managed to score only 0.5 points by partially answering just one knowledge-based question. It struggled with reading comprehension questions, often failing to understand the instructions, making it impractical for use.

Additionally, Tsuzumi’s performance worsened as input prompts grew longer, suggesting that it’s not suitable for handling large volumes of text, such as in RAG (Retrieval-Augmented Generation) systems. Since my purpose was to evaluate whether Tsuzumi could be integrated into a RAG system, I concluded that it’s unlikely to be a viable option.

Even with input lengths of about 1,000 characters, Tsuzumi felt inadequate compared to GPT-4o, which can handle and understand texts as long as 10,000 characters with much greater accuracy.

The results are understandable given the model’s scale.

To be honest, the performance of LLMs (Large Language Models) largely depends on two factors: the size of the training dataset and the number of parameters in the model. The parameters are the learned weights on the connections between the “nodes” of the model’s neural network, and the dataset represents the amount of information the model has “studied.” Simply put, a model with more parameters and a larger dataset will generally perform better.

Of course, training methods and model architecture also matter. However, major overseas models are developed by highly skilled engineers earning millions, so I assume their design and training processes are top-notch.

Creating datasets is also costly, and the computational resources required to train a model grow steeply as the number of parameters increases. This significantly drives up development costs.

Tsuzumi, with its 7 billion parameters, is modest compared to recent models with over a trillion parameters. It seems to have been developed with a more constrained approach, likely avoiding the “arms race” of massive budgets. In that sense, achieving this level of performance with 7 billion parameters is impressive. For reference, GPT-3.5 Turbo is rumored to have hundreds of billions of parameters.

From a parameter perspective, Tsuzumi might be performing well. However, based on my experience using it, it doesn’t seem suitable for practical use cases.

In IT business applications, Japanese companies should focus on steady adoption of best practices rather than rushing to catch up.

This might be a side note, but Japan’s IT industry, which is several years behind global trends, doesn’t need to compete with overseas players. I believe domestic AI initiatives like Tsuzumi are not aiming to “win” but rather to gain insights from global leaders or use their development as a marketing tool.

For those of us in the IT field, this perspective may seem obvious, but I suspect many users might have a different view. There’s no need to be at the cutting edge. Instead, we can calmly observe overseas technologies and case studies, identify what needs to be done, and walk the well-paved paths that global pioneers have already struggled to create.

At this point, Japan is many laps behind, so there’s no point in trying to catch up. It’s enough to simply move forward at our own pace from where we currently stand.

That said, IT professionals are partly to blame for creating unrealistic expectations by using buzzwords like “cutting-edge” or “latest technology” as part of sales pitches. This is particularly evident in some of the commentary surrounding Tsuzumi. It’s important for Japan’s IT industry to maintain a humble attitude, appreciating the hard work and expertise of overseas leaders whose technologies and methods we are fortunate to utilize.

Still, thinking about the people involved in developing Tsuzumi leaves me with mixed feelings. It must have been a difficult challenge—delivering results within a limited budget in what felt like a losing battle. Perhaps the developers were purely driven by technical curiosity, which kept them motivated despite the odds.

AI projects often involve an overwhelming amount of uncertainty, requiring teams to push forward without clear answers about what’s meaningful or where the true value lies. It’s a mentally taxing process, almost like a form of spiritual training. I don’t know the details of how Tsuzumi was developed, but I wonder what the atmosphere was like in the development team.

Meaning of ‘Burikasu’ (ブリカス): A Japanese Explanation

This article is an English explanation of Japanese slang, written by a Japanese person who is studying English.

Have you heard the word ‘brikasu’ or ‘burikasu’ (ブリカス)?

This word is Japanese internet slang. It is made from “Britain” (ブリテン) and “kasu” (カス), which means trash.

Japanese people use this word when disparaging British people. But the word also carries respect, compassion, and fear toward them.

Therefore, this slang emerged from a very complex background.

Usually, Japanese people refer to the UK as “English” (イギリス)

Almost all Japanese people call the UK “English” (イギリス).

This is grammatically incorrect, but it has been customary since the Meiji era.

Why is “Britain” used in “brikasu” (ブリカス)?

The reason is that this term was created from the Japanese people’s historical perception of the UK.

Japanese students are taught history in a way that leaves the impression that Britain, seeing itself at the center of the world, has shrewdly influenced every country out of a mix of responsibility and ego.

Many Japanese believe there is no country that has not been influenced by the British, including Japan.

In particular, they believe that Britain is responsible for the uprisings in Asia, such as the Opium War and the Sepoy Rebellion, as well as several wars in the Middle East.

Thus, the image of Britain (not the UK) as a dark player in history became associated with the nation.

 

The word “brikasu” ブリカス started to be used on internet forums

History buffs and conspiracy theorists claimed on the online forum 5チャンネル that every problem was Britain’s fault.

They started using “brikasu” (ブリカス) to mean the UK, as internet slang or a meme.

However, as I have explained, this word is not just a derogatory term, but a term of endearment with many meanings.

Like “brikasu” (ブリカス), there are many Japanese internet slang terms or memes related to the UK.

Among them are the following words.

  • 英国面 “eikokumen”: A term for a Japanese person infatuated with Britain, modeled on the “dark side” (暗黒面) from Star Wars.
  • 紅茶でキメる “tripping on tea”: A joke that the British feel high and less tired when they drink tea.

If a Japanese person uses the word “brikasu” ブリカス, please do not consider it just a derogatory term.

 

Self-Introduction Memo

A self-introduction memo for a first meeting with a person from an overseas company

I will be participating in this project from next year.

I started working for this company in April of this year, and I was assigned to the same department as him in October.

Until then, I was at university. I built neural network models that imitate the human brain and researched the mechanisms of memory.

Usually, I work as a member of the product development team.

However, I wanted to gain some experience working on a project with people outside the company, so I have been invited to join.

There are still many things I don’t know, but I want to communicate with you, and I would appreciate your advice.

The Reality of the Japanese School Club “Bukatsu,” Written by a University Student

Have you heard of “bukatsu” (部活、ぶかつ)?

If you watch Japanese anime or dramas, you may already know this word.

In these stories, Japanese students are always practicing sports, instruments, or painting in “bukatsu.” There are no scenes showing them studying hard.

Do you think this is true?

In fact, the main task imposed on Japanese students is “bukatsu,” although they need to study too.

In my case, I went to the ground early in the morning and practiced football until class began.

In the daytime, I often fell asleep in class.

After class, I needed to practice football for hours. Even after the sun went down, we kept playing.

I remember now that we used a ball coated in fluorescent paint. Thanks to this ball, we could continue to play under the lights of the building.

We were allowed to rest once a month; otherwise, we did “bukatsu” every day.

Now that I think about it, being a teacher in Japan is a harsh job. Teachers need to prepare classes and also instruct students in “bukatsu” outside working hours.

This is my own story.

Almost all junior high school students and half of high school students belong to “bukatsu”

You may think that I was always practicing football because I was a lazy student.

You might be right, but I am now studying in a master’s course at a national university, and I actually like learning.

So why did I always do “bukatsu”?

It is because I was forced to join “bukatsu” by Japanese educational custom.

In the countryside, there is a rule that every junior high school student needs to belong to one of the “bukatsu” teams.

Of course, I wanted to play football. I just didn’t want to play football every day.

Several reasons to join “bukatsu”

What about high school students?

They don’t have to join a “bukatsu” club team; however, students have various reasons to do “bukatsu.”

People who have continued to belong to a sports club in “bukatsu” are called “taiikukaikei” (体育会系、たいいくかいけい).

They are thought to have strength, patience in irrational situations, and obedience toward their seniors.

Large corporations in particular tend to prefer “taiikukaikei.”

Therefore, students who did “bukatsu” can get into famous universities and companies. However, they need to achieve good results, so they practice very hard.

In my case, I was not a good enough player to get into a university that way.

I belonged to “bukatsu” because there was nothing I wanted to do other than “bukatsu” (football). I was led to believe that if I became a better player, I could live a fulfilling life.

This is not just my case. It also applies to other students who join “bukatsu.”

These people think that their own value is determined by their skills in “bukatsu,” and then, after quitting “bukatsu,” they realize, as I did, that it is impossible to raise their level of happiness that way.

A special school for graduates who did “bukatsu” in their high school days

I did nothing but “bukatsu” until my third year, so there was no school whose exams I could pass.

In Japan, many students who didn’t study well enter a special school to prepare for the entrance examinations. This kind of school is called “yobikou” (予備校、よびこう).

These schools are found in big cities. Students who did not study because they belonged to “bukatsu” continue studying at “yobikou” for one year or more to prepare for university entrance examinations.

In Japanese, spending this time at “yobikou” is called “rounin” (浪人). After studying at “yobikou,” I entered the university I attend now.

Is “bukatsu” just an outdated custom?

Recently, the number of people who belong to “bukatsu” has been decreasing because more emphasis is being placed on the individual independence of students.

However, it is also true that students can learn many things from “bukatsu.” And I think enthusiasm for something other than studying has value in life.

Devoting their lives to “bukatsu” may be valid as education, if they can approach their days in “bukatsu” actively.

Hello, world!

I am a Japanese graduate student at the University of Electro-Communications (UEC) in Tokyo, Japan.

I am studying and researching neural networks. I am due to graduate from this university and start working as an engineer at a system developer in April.

The reason I started building this blog is to post what is on my mind. pickerlab.net is my blog in Japanese. I have written some things there about student life and job hunting. Recently, I have been thinking about posting articles for readers abroad.

Because all my posts on pickerlab.net are written in Japanese, it was necessary to write in English so that people from abroad can visit the blog.

I plan to write about my experience of student life and research in the laboratory.