AI

Cheap AI 'Video Scraping' Can Now Extract Data From Any Screen Recording (arstechnica.com) 13

An anonymous reader quotes a report from Ars Technica: Recently, AI researcher Simon Willison wanted to add up his charges from using a cloud service, but the payment values and dates he needed were scattered among a dozen separate emails. Inputting them manually would have been tedious, so he turned to a technique he calls "video scraping," which involves feeding a screen recording video into an AI model, similar to ChatGPT, for data extraction purposes. What he discovered seems simple on its surface, but the quality of the result has deeper implications for the future of AI assistants, which may soon be able to see and interact with what we're doing on our computer screens.

"The other day I found myself needing to add up some numeric values that were scattered across twelve different emails," Willison wrote in a detailed post on his blog. He recorded a 35-second video scrolling through the relevant emails, then fed that video into Google's AI Studio tool, which allows people to experiment with several versions of Google's Gemini 1.5 Pro and Gemini 1.5 Flash AI models. Willison then asked Gemini to pull the price data from the video and arrange it into a special data format called JSON (JavaScript Object Notation) that included dates and dollar amounts. The AI model successfully extracted the data, which Willison then formatted as a CSV (comma-separated values) table for spreadsheet use. After double-checking for errors as part of his experiment, the accuracy of the results -- and what the video analysis cost to run -- surprised him.
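The JSON-to-CSV step described above can be sketched in a few lines of Python. This is only an illustration, not Willison's actual code: the field names `date` and `amount`, the sample values, and the helper names `json_to_csv` and `total_amount` are all assumptions made up for the example.

```python
import csv
import io
import json
from decimal import Decimal

# Hypothetical model output: the field names and values are invented
# for illustration and do not come from the article.
sample_json = """
[
  {"date": "2024-01-05", "amount": 12.50},
  {"date": "2024-02-05", "amount": 7.25}
]
"""


def json_to_csv(json_text):
    """Convert a JSON array of {date, amount} objects to CSV text."""
    rows = json.loads(json_text)
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["date", "amount"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def total_amount(json_text):
    """Add up the amounts, using Decimal to avoid float rounding drift."""
    rows = json.loads(json_text)
    return sum(Decimal(str(row["amount"])) for row in rows)


if __name__ == "__main__":
    print(json_to_csv(sample_json))
    print("Total:", total_amount(sample_json))
```

The extracted values would still need to be checked against the original emails, as Willison did, since nothing in this step can catch a number the model misread.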

"The cost [of running the video model] is so low that I had to re-run my calculations three times to make sure I hadn't made a mistake," he wrote. Willison says the entire video analysis process ostensibly cost less than one-tenth of a cent, using just 11,018 tokens on the Gemini 1.5 Flash 002 model. In the end, he actually paid nothing because Google AI Studio is currently free for some types of use.
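The "less than one-tenth of a cent" figure can be sanity-checked with simple arithmetic. The token count below comes from the summary; the per-token price is an assumption (a rate of $0.075 per million input tokens for Gemini 1.5 Flash at the time), so treat the result as a rough check rather than an official figure.

```python
TOKENS_USED = 11_018  # token count quoted in the summary

# Assumption: $0.075 per million input tokens. The summary does not
# state the rate, so this value is illustrative only.
PRICE_PER_MILLION_USD = 0.075


def cost_usd(tokens, price_per_million):
    """Return the cost in US dollars for a given number of tokens."""
    return tokens / 1_000_000 * price_per_million


cost = cost_usd(TOKENS_USED, PRICE_PER_MILLION_USD)

if __name__ == "__main__":
    print(f"Estimated cost: ${cost:.8f}")
    print("Under one-tenth of a cent:", cost < 0.001)
```

Under that assumed rate the estimate comes to roughly $0.0008, which is consistent with the claim in the summary.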

  • I do similar things (Score:5, Interesting)

    by SirSlud ( 67381 ) on Friday October 18, 2024 @05:01PM (#64875571) Homepage

    I take screenshots of a bunch of web pages and then just describe to the LLM what it's looking at, and how I'd like it combined, arranged, and formatted (in markdown, to boot). It's rather impressive how well it gets stuff like that right off the bat. It took a task I used to hate doing, and now it takes me 1/10th of the time, if that. It wouldn't surprise me if it works equally well with video, although maybe how cheap it is to do is notable.

  • Obviously, you sometimes simply will get a wrong result on top as a bonus. I mean, we are now using "AI" to add numbers?

    • Obviously, you sometimes simply will get a wrong result on top as a bonus. I mean, we are now using "AI" to add numbers?

      Reminds me of the Google analytics chart showing how many people asked "What's the number for 911?" -- which apparently wasn't a joke.

  • by Kelxin ( 3417093 ) on Friday October 18, 2024 @05:12PM (#64875597)
    This has been happening for over a year. Let me know when AI can watch porn with me and suggest new models in similar tastes.
    • by Hodr ( 219920 )

      GoogleyMoogley AI has finished watching all 927 hours of pornographic content on your mobile device and suggests you.......take a seat over there.

    • Meaning the people 'selling' porn must not benefit from an AI tool that matches consumers with appropriate content.

      Not sure if the websites themselves would lose out (or they don't see value in attempting it for the expected costs)... Or the content creators freak out and leave. Or what.

      Or maybe the people who could fund something like that haven't decided to? Meaning even in 2024 we seem to have a lot of people who ignore stuff like violence, lack of food/water/housing, etc... but freak out abou

  • by thesjaakspoiler ( 4782965 ) on Friday October 18, 2024 @07:11PM (#64875855)

    An AI distorting you for more energy and compute power, Microsoft Recall will deliver it in 2025!

  • Willison then asked Gemini to pull the price data from the video and arrange it into a special data format called JSON (JavaScript Object Notation) that included dates and dollar amounts. The AI model successfully extracted the data, which Willison then formatted as CSV (comma-separated values) table for spreadsheet use.

    I wonder if he could have taken a second video recording of the JSON result set and asked the AI model to then convert it for him as the desired CSV format...

  • I guess it gave you the data you could 'step by step' look at and prove to yourself how well it worked. But I thought the point was to just explain what we wanted to know, and it'd try to do that for us. Especially when something is already in a text format (like emails), it feels b0rken to take recordings of them and feed that into a computer based tool.

    Is this just to work around the 'I cannot prove/limit what you will use of a large data source, so I will artificially limit what data you can see instea
