Dear Fellow Scholars, this is Two Minute Papers with Károly Zsolnai-Fehér.

In this project, OpenAI built a hide and seek game for their AI agents to play.

While we look at the exact rules here, I will note that the goal of the project was to pit

two AI teams against each other, and hopefully see some interesting emergent behaviors.

And, boy, did they do some crazy stuff.

The coolest part is that the two teams compete against each other, and whenever one team

discovers a new strategy, the other one has to adapt.

Kind of like an arms race situation, and it also resembles a generative adversarial network

a little.

And the results are magnificent, amusing, weird - you’ll see in a moment.

These agents learn from previous experiences, and to the surprise of no one, for the first

few million rounds, we start out with…pandemonium.

Everyone just running around aimlessly.

With no proper strategy and only semi-random movements, the seekers are favored and hence win the

majority of the games.

Nothing to see here.

Then, over time, the hiders learned to lock out the seekers by blocking the doors off

with these boxes and started winning consistently.

I think the coolest part about this is that the map was deliberately designed by the OpenAI

scientists in a way that the hiders can only succeed through collaboration.

They cannot win alone and hence, they are forced to learn to work together.

Which they did, quite well.

But then, something happened.

Did you notice this pointy, doorstop-shaped object?

Are you thinking what I am thinking?

Well, probably, and not only that, but about 10 million rounds later, the AI also discovered

that it can be pushed near a wall and be used as a ramp, and, tadaa!

Got’em!

The seekers started winning more again.

So, the ball is now back in the hiders' court.

Can you defend this?

If so, how?

Well, these resourceful little critters learned that there is a little time at the start

of the game when the seekers are frozen and, apparently, cannot see the hiders, so

why not just sneak out, steal the ramp, and lock it away from them?

Absolutely incredible.

Look at those happy eyes as they are carrying that ramp.

And, you think it all ends here?

No, no, no.

Not even close.

It gets weirder.

Much weirder.

When playing a different map, a seeker noticed that it can use a ramp to climb on

top of a box, and this happens.

Do you think couchsurfing is cool?

Give me a break!

This is box surfing!

And, the scientists were quite surprised by this move as this was one of the first cases

where the seeker AI seems to have broken the game.

What happens here is that the physics system is coded in a way that the agents are able to move

around by exerting force on themselves, but there is no additional check whether they

are on the floor or not, because who in their right mind would think about that?

As a result, something that shouldn’t ever happen does happen here.
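
To make the missing check concrete, here is a minimal toy sketch in Python. It is not the actual physics code of the project: the `Agent` class, the `step` function, the `require_floor` flag, and the unit-mass, one-dimensional motion are all assumptions made for illustration. It only shows how a self-exerted push that ignores whether the agent is on the floor lets an agent standing on a box keep moving.

```python
from dataclasses import dataclass


@dataclass
class Agent:
    # Hypothetical toy agent, not the project's data structure.
    x: float = 0.0         # horizontal position
    vx: float = 0.0        # horizontal velocity
    on_floor: bool = True  # False when standing on a box, for example


def step(agent: Agent, push: float, dt: float = 0.1,
         require_floor: bool = False) -> None:
    """Advance the agent by one time step (unit mass assumed).

    With require_floor=False the push is always applied, which mirrors the
    missing check described in the video. With require_floor=True the push
    is ignored unless the agent is standing on the floor.
    """
    if agent.on_floor or not require_floor:
        agent.vx += push * dt
    agent.x += agent.vx * dt


# An agent on top of a box, pushing itself forward.
surfer = Agent(on_floor=False)
step(surfer, push=1.0)                      # no check: the agent moves

grounded_only = Agent(on_floor=False)
step(grounded_only, push=1.0, require_floor=True)  # with check: it stays put
```

In this toy model only the agent moves. In the video the box travels along with the seeker, which this sketch does not attempt to reproduce.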

And, we’re still not done yet, this paper just keeps on giving.

A few hundred million rounds later, the hiders learned to separate all the ramps from the

boxes.

Dear Fellow Scholars, this is proper box surfing defense… Then, they lock down the remaining tools

and build a shelter.

Note how well rehearsed and executed this strategy is - there is not a second of time

left until the seekers take off.

I also love this cheeky move where they set up the shelter right next to the seekers,

and I almost feel like they are saying “Yeah, see this here?

There is not a single thing you can do about it.”

In a few isolated cases, other interesting behaviors also emerged, for instance, the

hiders learned to exploit the physics system and just chuck the ramp away.

After that, the seekers go “What?

What just happened?”

But don’t despair, and at this point, I would recommend that you hold on to your

papers because there was also a crazy case where a seeker learned to abuse a similar

physics issue and launch itself exactly onto the top of the hiders.

Man, what a paper.

This system can be extended and modded for many other tasks too, so expect to see more

of these fun experiments in the future.

We get to do this for a living, and we are even being paid for this.

I can’t believe it.

In this series, my mission is to showcase beautiful works that light a fire in people.

And this is, no doubt, one of those works.

Great idea, interesting, unexpected results, crisp presentation.

Bravo OpenAI!

Love it.

So, did you enjoy this?

What do you think?

Make sure to leave a comment below.

Also, if you look at the paper, it contains comparisons to an earlier work we covered

about intrinsic motivation, shows how to implement circular convolutions so that the agents can detect

the environment around them, and more.
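
For readers curious what a circular convolution looks like, here is a minimal pure-Python sketch. It assumes the agent senses its surroundings as a ring of distance readings, one per direction, and the function name `circular_conv1d`, the readings, and the smoothing kernel are made up for illustration, so this is not the paper's implementation. The key idea is that the indices wrap around, so the first and the last reading are treated as neighbors, just as they are on a circle.

```python
from typing import List


def circular_conv1d(signal: List[float], kernel: List[float]) -> List[float]:
    """Slide an odd-length kernel over a ring of values with wrap-around.

    As in most deep learning libraries, the kernel is not flipped, so
    strictly speaking this is a circular cross-correlation.
    """
    n = len(signal)
    half = len(kernel) // 2
    out = []
    for i in range(n):
        total = 0.0
        for k, w in enumerate(kernel):
            # The modulo makes the index wrap around the ring.
            total += w * signal[(i + k - half) % n]
        out.append(total)
    return out


# Four hypothetical range readings around the agent, one far obstacle.
readings = [1.0, 1.0, 5.0, 1.0]
smoothing = [0.25, 0.5, 0.25]
print(circular_conv1d(readings, smoothing))  # → [1.0, 2.0, 3.0, 2.0]
```

If the same readings are rotated around the ring, the output rotates by the same amount, which is why this kind of filter suits a sensor that looks in all directions.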

Thanks for watching and for your generous support, and I'll see you next time!

OpenAI Plays Hide and Seek…and Breaks The Game! 🤖

