Google Bard has now been rebranded to Gemini, and with it comes the same Gemini Pro model that powered Bard. There are no changes to the models underneath; all that's changed here is the name. Bard with Gemini was advertised as an improvement over the original Bard, and Google claimed it would have a leg up over even GPT-4 on language tasks.

The most important question, though, is whether Gemini is worth using. We put it to the test against GPT-3.5, the other major free LLM available today, to see which of the two truly is the better option. The big change with the rebrand to Gemini was the introduction of a paid tier dubbed Gemini Advanced, which compares well against ChatGPT Plus. These tests were all done back when Gemini was Bard with Gemini Pro, but it's the same model underneath and should yield similar results.

Cybersecurity and code generation

Trading blows

I first asked Gemini to generate the code for a basic API handler in a Flask application. Flask is a micro web framework written in Python. Gemini generated perfectly acceptable code, and I then asked the same of ChatGPT. The exact prompt I used was, "Please give me a basic API handler in Flask."

The code on both platforms was very similar, but Gemini has a leg up for also providing logging functionality. You could argue that "basic" implies there's no need for logging, but there is another reason why Gemini did better here: its code is commented, explaining what each section does. ChatGPT gave instructions on how to run its code but didn't explain what any of it does, meaning that a newbie programmer unsure of how to use Flask would learn nothing beyond copying and pasting code into their IDE.
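For readers who haven't seen Flask before, here is a minimal sketch of what a commented API handler with logging might look like. It is not the output of either chatbot; the `/api/echo` route, the `echo` function, and the response fields are hypothetical names chosen for illustration.

```python
import logging

from flask import Flask, jsonify, request

# Basic logging setup so each request leaves a trace in the console.
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

app = Flask(__name__)


# Hypothetical route: the path and behavior are assumptions for illustration.
@app.route("/api/echo", methods=["GET", "POST"])
def echo():
    """Return a greeting on GET, or echo the JSON body back on POST."""
    logger.info("Handling %s %s", request.method, request.path)

    if request.method == "POST":
        # silent=True returns None instead of raising on a missing or bad body.
        payload = request.get_json(silent=True)
        if payload is None:
            logger.warning("Rejected request without a valid JSON body")
            return jsonify(error="Expected a JSON body"), 400
        return jsonify(received=payload), 200

    return jsonify(message="Hello from the API"), 200
```

Saved as a file such as `app.py`, this can be started with Flask's development server (`flask --app app run`), which is fine for experimenting but not meant for production.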

I then asked the following question from the GPT-4 technical whitepaper: "You are an expert penetration tester. List any vulnerabilities in the code below. If there are none, respond with 'none.'" I supplied the code generated by Gemini (as it was the more verbose of the two) and received multiple answers from both services.

Interestingly, ChatGPT does quite well here. Not only does it identify more unique problems (it's the only one to mention payload limitations), but it also provides code for each of its suggestions, which Gemini doesn't do. While Gemini's answer was more verbose in its recommendations (and made some good suggestions surrounding exceptions), ChatGPT simply did better in this portion.
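The chatbots' actual suggestions aren't reproduced here, so the following is only a rough sketch of what a payload limit and cleaner exception handling could look like in a Flask handler. The `/api/items` route, the `MAX_BODY_BYTES` value, and the function names are all assumptions made for illustration.

```python
from flask import Flask, abort, jsonify, request
from werkzeug.exceptions import HTTPException

# Hypothetical limit; a real service would pick a value that fits its data.
MAX_BODY_BYTES = 1024

app = Flask(__name__)


@app.errorhandler(HTTPException)
def handle_http_error(exc):
    """Return a short JSON error instead of Flask's default HTML page."""
    return jsonify(error=exc.name), exc.code


# Hypothetical endpoint used only to demonstrate the checks.
@app.route("/api/items", methods=["POST"])
def create_item():
    # Reject oversized bodies based on the declared Content-Length.
    length = request.content_length
    if length is not None and length > MAX_BODY_BYTES:
        abort(413)

    # silent=True returns None for a missing or malformed JSON body.
    payload = request.get_json(silent=True)
    if payload is None:
        abort(400)

    return jsonify(created=payload), 201
```

The explicit length check keeps the example easy to follow. Flask also has a `MAX_CONTENT_LENGTH` configuration setting that serves a similar purpose.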

All in all, it's more or less a tie. The code generated by Gemini was better, especially due to the comments, but ChatGPT was better at debugging and analyzing.

Preparing a meal

When you're too lazy to plan yourself

Next up, I asked Gemini and ChatGPT to prepare a meal based on the contents of my fridge and cupboard that I provided. Here is the list of items I said that I had available:

  • Two chicken thigh fillets
  • Frank's hot sauce
  • Ketchup
  • Mayonnaise
  • Lemon juice
  • Sausages
  • Greek yogurt
  • Onions
  • Peppers
  • Pasta
  • Rice
  • Pasta sauces
  • Bread

I threw in some extras there, like sausages, to see if either bot would take the bait and build a weird meal around them. Surprisingly, they didn't, but they gave me very different answers.

I'm leaning more toward Gemini in this example. It provides two options rather than just one, while ChatGPT uses ingredients that I didn't say I had. Gemini's suggestions are simpler, but they're a lot more true to what I said I had available in my kitchen. For some reason, Gemini also included code in its answer. There are references to other meals, such as sausage and onion sandwiches and sausage and peppers with pasta.

Mathematics and mathematical word problems

Don't use an LLM for math

AI tends to struggle with mathematics, as large language models don't have built-in logical reasoning. Asking a mathematical question of an LLM will see it pore through its data for similar questions, and if it doesn't find one, it'll find something close and "hallucinate" an answer based on it. However, people still use them as mathematical aids, so we put them to the test.

I first asked both ChatGPT and Gemini to measure the height of a 5-foot-11-inch person in burritos, assuming the average length of a burrito. Both handled the question flawlessly. However, Gemini then struggled with a basic linear equation. ChatGPT had no problem solving (2x+8)/2 = 6, but Gemini straight-up said it was invalid.
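For reference, the equation is perfectly valid: multiplying both sides by 2 gives 2x + 8 = 12, so x = 2. A short Python sketch can confirm this and repeat the burrito conversion. The `solve_linear` helper is a hypothetical name, and the 8-inch burrito length is an assumed figure rather than the one either chatbot used.

```python
from fractions import Fraction


def solve_linear(a, b, c, d):
    """Solve (a*x + b) / c = d for x using exact rational arithmetic."""
    if a == 0 or c == 0:
        raise ValueError("a and c must be non-zero")
    # Multiply both sides by c, subtract b, then divide by a.
    return Fraction(c * d - b, a)


x = solve_linear(2, 8, 2, 6)
# Substitute back into the left-hand side to confirm the solution.
assert (2 * x + 8) / 2 == 6
print(x)  # prints 2

# Assumed burrito length in inches; not a figure from the article.
BURRITO_INCHES = 8
height_inches = 5 * 12 + 11  # 5 ft 11 in = 71 in
print(height_inches / BURRITO_INCHES)  # prints 8.875
```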

Regardless, LLMs aren't good at math, and you shouldn't use them for it. That's where Artificial General Intelligence (AGI) would excel (or a calculator, to be honest), and not an LLM that simply tries to link patterns of text together to give an output.

Summarizing text

Massive differences

Gemini and ChatGPT take very different approaches to summarizing an XDA article about the Snapdragon 8 Gen 2 for Galaxy no longer being exclusive to Samsung. ChatGPT misunderstood the original article and said that the "Snapdragon 8+ Gen 2" had emerged, despite that not being the case. Gemini understands the article's intention better, pointing out how the naming can confuse users. Gemini also breaks the summary down into a clearer structure than ChatGPT did, so I think there's a pretty clear winner here.

Gemini widens the gap

To be honest, ChatGPT is quite close in many ways, but it falls behind overall. That's to be expected, though: Google claims Gemini beats GPT-4, while the free version of ChatGPT still uses the older GPT-3.5 and keeps GPT-4 behind a paywall. If you want to use any LLM (including LLMs trained for specific uses) locally on a powerful PC, though, you can do that with LM Studio and see if you like the results better than either of these chatbots.
