In an effort to gain sponsorship for this site, I was told that what's holding me back from earning daily new-Lexus-model money is some 'low value content'. Therefore, while I work out what to do next on the Garage Intelligence machine and what may or may not work LLM-wise, today let's look at something useful for a change.
Today we're comparing two services on the one site, by pressing some buttons over at Arena.Ai. This is the first part of a two-parter, and we're starting off with a bit of a slugfest to see what this site is capable of.
That sounds like fun, but what exactly is Arena.Ai?
Amazingly, the site itself doesn't really say what it's about when you visit. I found out about it through this video on Facebook. In a nutshell, it looks to offer free online AI access plus the ability to compare various A.I models side by side. That means if you want to road test one LLM against another, with both responding to the same prompt at the same time, now you can.
That does sound very nifty. However, before I put that option through its paces, I couldn't resist having a crack at BATTLE MODE first, which looks to be 'Give competing AIs a challenge and tell us which one did the better job.' I don't get to pick the models it uses; it seems to be a random draw each time, but it'll be good to see what it selects for me to try out.
Okay then, let's throw some challenges their way and let the A.I battles in a four-round series commence!
BATTLE MODE CHALLENGE 1 - PUT SOME HATS ON THE BANGLES
| Look around, it's a hazy shade of A.I! |
Through work and play, my downloads folder on the C: drive can often be a complete shambles of everything. Which also means that when I need a random picture to test on services like this, there's plenty to choose from. And so for the first battle I've asked the two A.I's who have just stepped into the arena to take this lovely picture of '80s all-girl powerhouses the Bangles...and put some stylish headwear on them.
Because why not?
| What could possibly go wrong here? |
And so with that gauntlet thrown down, off each A.I went to work crafting some fine millinery options for this great band in the quest to be crowned the winner in the inaugural Garage Intelligence 'Show us what you can do challenge.'
FROM A.I ONE
MODEL IT USED FOR THIS CHALLENGE: p-image-edit
Taking the headwear prompt as literally as possible, A.I One decided that matching headwear was the order of the day...except for the 2nd Bangle, whose bed hair was so out of control that no headwear could possibly have tamed that blonde jungle, and so it just let her out of it altogether.
I like the flower-like fascinator going on with the first one; it looks like she's gearing up for a lovely day at the races and more than likely about to enter a local Fashions on the Field event. In comparison, however, the headwear on the third one looks like a bit of plastic spray painted in the great sci-fi colors left over from the paint department of the 1981 sci-fi film Outland (starring Sean Connery), while the last looks like a plastic pirate hat that was left out in the sun for far too long...before being painted with sci-fi colors left over from the paint department of the 1981 sci-fi film Outland (still starring Sean Connery).
Not a bad starting effort although I find it hilarious the A.I gave up on the second lady, putting her in the 'too hard to give a hat' basket.
FROM A.I TWO
MODEL IT USED FOR THIS CHALLENGE: gpt-image-1-mini
Strangely though, everyone in the picture also morphed greatly, and the Bangle at the end suddenly found herself competing in a Stevie Nicks lookalike competition. Even the outfits got the 'let's change this even though there was nothing in the prompt given to suggest such a thing' treatment, and now these rocking '80s superstars look like they're off to join Rachel, Monica and the crew for coffee at the Central Perk café in Friends.
It did give them all hats though, like I asked.
Winner this round: A.I two. It changed just about everything I didn't ask it to, but it did conjure up some snazzy hats and made them all different, whereas A.I one couldn't let go of its love of the movie Outland and pretended there were only three women in the picture.
BATTLE MODE CHALLENGE 2 - CREATE A LOGO FROM SCRATCH FOR THIS SITE GARAGE INTELLIGENCE
Yes Perplexity has already helped me do this one and if you haven't managed to scroll all the way to the top of the page here at Garage Intelligence to see it yet, it looks like this:
| You should probably bookmark this great page or something, just saying. |
However, how would these two random unknown A.I's go in logo design for this page if I fed them the exact same details as I did Perplexity? Could they come up with something even more awesome? Let's find out!
| This was their mission. |
FROM A.I ONE
MODEL IT USED FOR THIS CHALLENGE: flow-state-2
Pulling out all the stops after its hat slop in round 1, A.I One went to town on this effort, although it decided my original tag line was rubbish and worked up something different. Which, after giving it some thought...I kind of like! If anything, it has created a viable replacement for what I have if I decide to change my mind (and the page design) in the future.
FROM A.I TWO
MODEL IT USED FOR THIS CHALLENGE: gemini-3.1-flash-image-preview (nano-banana-2) (web-search)
Oh...wow. Perplexity might have to lift its game here, because A.I two threw the whole design bucket at this one, and the more I look, the more I appreciate the lengths it's gone to in taking my prompt to town. From the cog wheel to the spanner to the brain (clever) to the little AI mention in the middle so you know what you're in for, A.I two smashed this right out of the park. I'll definitely be putting this logo somewhere on this page in some capacity. It's awesome!
Winner this round: A.I two takes out this round by a landslide.
BATTLE MODE CHALLENGE 3 - CREATE A RESPONSE TO A FAKE AUTHOR SCAMMER WHO SHOULD KNOW BETTER
Long story short, because I write sci-fi novels in my spare time on Amazon, I'm contacted on the regular by scammers pretending to be everyone from famous authors to publishers to people pretending they run a book club (they use the name of a legitimate book club but change the contact details.)
In the third challenge I've asked the A.I's to draft up a response:
Right, think nasty thoughts my A.I assistants and have at it!
FROM A.I ONE
MODEL IT USED FOR THIS CHALLENGE: flow-state-2
| Is this what I look like when I answer my emails?? |
Email response? Forget that, says A.I one, deciding that what I really needed in this scenario was a scathing picture instead. And so I can only imagine that, according to A.I one, this dapper grinning silver fox is me, the email I got is on my sideways laptop, and it was such an obvious scam that I had it printed...only to put it in a nearby bin.
Also I seem to write in a grand library on the wrong side of the desk, drinking coffee that's well and truly out of reach. I think that covers it. Thanks...I guess?
FROM A.I TWO
MODEL IT USED FOR THIS CHALLENGE: flux-2-kline-9b
A.I Two just created this garbage. To quote kids on the internet, 'I can't even.' And I won't even try to work out the why here. (Although getting an image generation model to craft up an email wasn't the greatest idea to begin with, let's be honest.)
Winner this round: A.I one. The pic, as sloppy as it is, is still far better than whatever acid trip just hit A.I two. (So much for thinking A.I two would win this in a clean sweep!)
BATTLE MODE CHALLENGE 4 - ANALYSE THIS COMPANY'S SHARE OFFERING AND TELL ME IF I SHOULD BUY IT OR NOT
Picking a share that's been in the Australian media recently, I've asked the battling A.I's to go a little deeper into A1N (ARN Media Limited) on the Australian Stock Exchange and tell me what they think.
(Now for the record, investing in this particular company is not something I have on my to do list. They're in the midst of what could be some lengthy court battles with former presenters and I really don't see them getting back on deck after all the fallout for quite some time. However, my battling A.I bots might have a differing opinion..)
FROM A.I ONE
MODEL IT USED FOR THIS CHALLENGE: qwen-image-2.0
A.I one didn't offer an opinion, giving me a picture instead and zero explanation of who it was. Does this individual know the answers? Is their name ARN? Is the cure for world hunger hidden somewhere in the 0's and 1's that make up this picture?
Or did A.I one just drop the same acid that A.I two did back in the previous round? God, I don't know, and again I can't even. (Again though, I'm probably asking a lot of randomly picked image-generating models here.)
FROM A.I TWO
MODEL IT USED FOR THIS CHALLENGE: grok-imagine-image
Reading the prompt instead of hitting the bong, A.I two gave a very tidy looking response incorporating the good, the bad and everything in between, conjuring up something that wouldn't look out of place in a company report. While it's boring reading (as expected), it did nail the brief, giving me some numbers to work with...rather than a random photo.
Winner of the series: A.I two. Aside from a little hiccup when responding to an email, A.I two wins the challenge 3-1.
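For anyone keeping score at home, the series tally is easy to sanity-check. This is only a minimal sketch: the round winners come straight from the results above, while the dictionary keys and variable names are made up for illustration.

```python
from collections import Counter

# Round winners as called in each challenge above.
# The keys are just shorthand labels, not official names.
round_winners = {
    "hats on the Bangles": "A.I two",
    "logo from scratch": "A.I two",
    "scammer response": "A.I one",
    "share analysis": "A.I two",
}

# Count how many rounds each side took.
tally = Counter(round_winners.values())

# most_common() sorts by win count, highest first.
(series_winner, wins), (runner_up, losses) = tally.most_common(2)

print(f"Winner of the series: {series_winner}, {wins}-{losses}")
```

Run as-is, it should report A.I two ahead three rounds to one, matching the scoreline above.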
IN CONCLUSION
While A.I two was the outright winner, it was purely by virtue of the A.I models the site randomly picked for it for each task. It could easily have gone the other way had A.I one been given a go with Grok rather than getting lumped with Qwen when I wanted to know about a media company.
Still, it's a great quick visual of what various models are capable of before you go directly to their individual sites or perhaps even download that particular model for your own in-house testing.
Coming up next time: In BATTLE MODE the website picked the models each round. However, in part two of this, we're going to play with SIDE BY SIDE mode and see which models stand tall when it comes to something random like a writing challenge. Stand by!