Wednesday, April 27, 2022

Federal Court Okays Web Scraping


I've scraped many a website in my time.

"Web scraping," in case you're wondering, refers to the compilation of information that's published on websites.

A few recent examples:
  • This week, I used web scraping to write a blog post on the birth of art history.

  • Earlier this month, I taught a brainy high schooler how to "speed write" a term paper by web scraping. In the process, I learned more about Shirley Jackson than you'd ever want to know.

  • Last November, I built an e-mail list of insurance executives for a client who had no marketing list.

Web scraping isn't theft.

Theft is what Instagram coach Kar Brulhart faced this week, when a rival ripped off her ideas verbatim and presented them as his own.

Web scraping is research, as a federal court ruled last week.

In the decision, the US Ninth Circuit Court of Appeals ruled against LinkedIn in its dispute with hiQ Labs, a research firm that studies employee attrition.

LinkedIn wanted hiQ to stop scraping its users' profiles.

The court ruled that scraping publicly accessible data doesn't violate the Computer Fraud and Abuse Act, which defines computer hacking under US law.

Hacking is defined as "unauthorized access to a computer system," but scraping snatches only public data.

The case had reached the Supreme Court last year but was sent back to the US Ninth Circuit for review.

The court's decision is a "major win for archivists, academics, researchers and journalists who use tools to mass collect, or scrape, information that is publicly accessible on the Internet," says TechCrunch.