YouTube

Why are people using Blue Sky?

33 Likes

926 Views

Dec 4, 2024

Full podcast: youtu.be/i1NlQixGW2I

Now that the seal is broken on scraping Bluesky posts into datasets for machine learning, people are trolling users and one-upping each other by making increasingly massive datasets of non-anonymized, full-text Bluesky posts taken directly from the social media platform’s public firehose, including one that contains almost 300 million posts.

Last week, Daniel van Strien, a machine learning librarian at open-source machine learning library platform Hugging Face, released a dataset composed of one million Bluesky posts, including when they were posted and who posted them. Within hours of his first post (shortly after our story about this being the first known, public, non-anonymous dataset of Bluesky posts, and following hundreds of replies from people outraged that their posts were scraped without their permission), van Strien took it down and apologized.

This is a production of 404 Media, a journalist-owned tech website. Learn more and subscribe at: https://404media.co

Listen to our weekly podcasts:
Apple Podcasts: https://podcasts.apple.com/us/podcast/the-404-media-podcast/id1703615331?ref=404media.co
Spotify: https://open.spotify.com/show/0F3oY47l2XgoBMaAmIaw29?ref=404media.co
Google Podcasts: https://podcasts.google.com/feed/aHR0cHM6Ly9mZWVkcy5hY2FzdC5jb20vcHVibGljL3Nob3dzL3RoZS00MDQtbWVkaWEtcG9kY2FzdA?ref=404media.co

Become a paid subscriber for access to bonus content: https://404media.co/membership

Comments 2
