It is a recurring issue in data privacy litigation: a plaintiff commences litigation challenging applications of new technology and raises various claims under decades-old data privacy laws that predate the technology at issue.  Such is the case with recent data scraping litigation, addressed in greater detail below.

What is data scraping?  Good question.  To generalize, it is a mechanism of extracting data from websites (including websites not available to the public and accessible only to individuals with user accounts).  The practices of Clearview, which has been the subject of recent litigation, are a prime example.  By compiling information scraped from the social media accounts of billions of individuals, Clearview was able to create a massive facial recognition database that it subsequently provided to third-party customers.  However, notwithstanding the clear privacy issues implicated by data scraping, there is no law specifically regulating this practice nationwide (although some state laws, as CPW has already covered, regulate the collection of biometric data).  As such, in litigation regarding data scraping, parties are stuck arguing over the application of various statutes that were enacted long before data scraping was prevalent.

As just one example: To address the growing problem of computer hacking, in 1984 Congress passed the Computer Fraud and Abuse Act (the “CFAA”), creating criminal and civil liability for a party who accesses a computer without authorization or in a manner exceeding their authorization.  To prevail on a civil CFAA claim, a plaintiff typically must demonstrate that a defendant intentionally accessed a computer without authorization or exceeded the authorized access, and thereby obtained information from a protected computer.  The CFAA has been extensively litigated, although courts have not interpreted its provisions consistently.  That inconsistency extends to data scraping.  While courts usually apply the CFAA in a manner that protects a website’s publicly available data against third-party unauthorized access, courts have also formulated various standards to determine whether a third party’s access to a website was without authorization or exceeded authorized access in violation of the CFAA.

This is because, among other things, the CFAA prohibits intentionally accessing a protected computer “without authorization” or in a manner that exceeds the authorized access, and obtaining information from such a computer.  The CFAA defines “protected computer” broadly, and the definition includes every computer connected to the Internet.  The CFAA also prohibits knowingly and with intent to defraud accessing a protected computer without authorization, or exceeding authorized access, and by means of such conduct furthering the intended fraud and obtaining anything of value.  18 U.S.C. Section 1030.  Importantly, however, the CFAA does not define the term “without authorization.”  This ambiguity in the statute has led to a split among the federal appeals courts regarding how the condition of “without authorization,” as used in the CFAA, should be applied in the context of data scraping.  While some circuit courts have broadly looked to whether collecting data from a website violates a website’s terms of use or service, other courts have more narrowly interpreted the condition to require the technical circumvention of some kind of code-based access restriction.

For instance, last year the Ninth Circuit in hiQ Labs, Inc. v. LinkedIn Corp., 938 F.3d 985 (9th Cir. 2019), addressed under what circumstances a company may legally “scrape” data from another company’s website.  There, the court determined on a motion for a preliminary injunction that “scraping” publicly available information from LinkedIn likely is not a violation of the CFAA because the LinkedIn computers are publicly accessible.  As such, hiQ did not access the computers “without authorization” as required by the CFAA.  The Second and Fourth Circuits follow this interpretation of the CFAA as well.

This approach is far from uniform, however.  See, e.g., Sw. Airlines v. Farechase, 318 F. Supp. 2d 435, 439-40 (N.D. Tex. 2004) (finding that a plaintiff plausibly alleged a CFAA claim when Southwest “directly informed” the defendant that its scraping activity violated the Use Agreement on Southwest’s website, which was “accessible from all pages on the website,” as well as via “direct repeated warnings and requests to stop scraping.”).  The First, Fifth, Seventh and Eleventh Circuits broadly interpret the CFAA to cover violations of corporate computer use restrictions and policies governing authorized uses of databases.

Three years in, the LinkedIn-hiQ battle over data scraping continues in both the Northern District of California and the Supreme Court of the United States, where LinkedIn’s petition for certiorari is pending.  For those who are not familiar, hiQ filed its initial complaint against LinkedIn in 2017, alleging that LinkedIn’s cease-and-desist letters to hiQ, followed by LinkedIn restricting hiQ’s access to its website, were anticompetitive and violated state and federal laws. The crux of hiQ’s complaint was that LinkedIn did not have monopoly rights to personal data made publicly available by its users, and that by scraping its website, hiQ did not violate users’ privacy rights (as LinkedIn alleges).  As mentioned, the Northern District of California granted hiQ’s request for a preliminary injunction barring LinkedIn from restricting hiQ’s access to publicly available LinkedIn member profiles.  LinkedIn appealed, but the appeal was denied.  LinkedIn then filed a petition for certiorari to the SCOTUS, which is currently pending.

Separate from the preliminary injunction, on September 9, 2020, Judge Chen of the Northern District of California granted in part LinkedIn’s motion to dismiss hiQ’s amended complaint. The Court dismissed all claims under the Sherman Act, the federal antitrust legislation.  Nine separate causes of action remain, including hiQ’s allegation that LinkedIn violated California’s Business and Professions Code (the California antitrust legislation).  LinkedIn filed its Answer and Counterclaims on November 20, including counterclaims under, you guessed it, the CFAA.

The specific question pending before the SCOTUS (in hiQ’s words) is: “Whether a professional networking website may rely on the Computer Fraud and Abuse Act’s prohibition on “intentionally access[ing] a computer without authorization” to prevent a competitor from accessing information that the website’s users have shared on their public profiles and that is available for viewing by anyone with a web browser.”  Theoretically, if SCOTUS rules in favor of hiQ, LinkedIn members (and users/members of other similar platforms) may lose their ability to control where and with whom their personal information is shared once they have made it public through the platform.  The ruling would also answer the question of who owns the rights to users’ “publicly accessible” data.  It is a critical question, and bound to have a major impact in the data scraping arena.

So there you have it.  Another day, another interesting development in data privacy litigation.  How this all shakes out in regards to data scraping (and what it means for the millions of individuals whose personal data is the target of such scraping) remains to be seen.  Stay tuned.

 

Join Julia Jacobson and Kyle Dull as they discuss pivotal legal hurdles that businesses face in the complex area of website data scraping, enforcing website terms, and how to stay compliant with the evolving data privacy landscape. The panel will address data scraping fact patterns, recent legal and case law developments, and jurisdictional nuances, as well as offer solutions and safeguards for reducing risks for businesses and data scrapers. This is an evolving area of the law, and businesses should take note of the developments.

Julia and Kyle will discuss these and other key issues:

  • The different ways to capture user consent
  • What are potential legal consequences of unauthorized data scraping?
  • What types of data should not be scraped to avoid risks?
  • What steps should counsel take to ensure compliance with the CFAA, DMCA, and other laws applicable to data scraping?

Date: Thursday, February 16, 2023

Time: 1 – 2:30 p.m. EST

Register at this link.

Please reach out to Julia and Kyle, or your relationship partner at the firm, for more information.

LinkedIn and hiQ Labs agreed to a consent judgment and permanent injunction to resolve all data scraping related claims after six years of litigation. This news follows last month’s summary judgment win by LinkedIn on its breach of contract claim against hiQ, based on a finding that hiQ’s data scraping and use of fake profiles violated LinkedIn’s user agreements. 


We have been covering the hiQ-LinkedIn data-scraping saga for several years now on CPW. (See previous posts here, here, here, and here).

After well-publicized litigation that made its way to the Supreme Court and back again, the United States District Court for the Northern District of California ruled that the provisions of a website user agreement that prohibit scraping and fake profiles are enforceable in a breach of contract claim. Businesses should take note and ensure that their own conduct enforces their terms and conditions in order to prevent violators from successfully claiming affirmative defenses. If a business knows of a violation, and wants to have enforceable terms, it should pursue remedying that violation.


Earlier this week, the Ninth Circuit, yet again, concluded that data scraping public websites is not unlawful. In hiQ Labs, Inc. v. LinkedIn Corp., a case that has been ongoing for nearly five years, the Ninth Circuit affirmed its earlier decision that LinkedIn may not rely on the Computer Fraud and Abuse Act (“CFAA”) to enjoin hiQ from scraping member data from LinkedIn’s websites. This decision comes in the wake of the Supreme Court’s decision in Van Buren v. United States.

As a reminder, data scraping is a mechanism of extracting data from websites (including both public websites and websites not available to the public and accessible only to individuals with user accounts). There is no federal law that expressly prohibits the practice. As such, parties seeking to challenge the practice rely on statutes that predate the prevalence of data scraping. One such statute is the CFAA, which forbids individuals from intentionally accessing a protected computer without authorization or “exceed[ing] authorized access.”

The applicability of this statute is the central issue in the hiQ v. LinkedIn data-scraping saga that we have covered previously. To recap, hiQ is a data analytics company that filed its initial complaint against LinkedIn in 2017, alleging that LinkedIn’s cease-and-desist letters to hiQ, followed by LinkedIn restricting hiQ’s access to its website, were anticompetitive and violated state and federal laws. The crux of hiQ’s complaint was that LinkedIn did not have monopoly rights to personal data made publicly available by its users, and that by scraping its website, hiQ did not violate users’ privacy rights (as LinkedIn alleges). LinkedIn, on the other hand, argued that hiQ was not entitled to a preliminary injunction, as its claims would be preempted by the CFAA as a result of its unauthorized use of LinkedIn’s website.

The district court granted hiQ’s request for a preliminary injunction, forbidding LinkedIn from denying hiQ access to publicly available LinkedIn member profiles. In September 2019, the Ninth Circuit affirmed the lower court’s decision, reasoning that scraping publicly available information from LinkedIn is not a violation of the CFAA because the LinkedIn computers are publicly available, and thus, hiQ did not access the computers “without authorization” under the CFAA. LinkedIn filed a petition for writ of certiorari in March 2020, which the Supreme Court granted in June 2021. The Court issued a summary disposition, vacating the Ninth Circuit’s previous judgment, and remanding the case for additional consideration in light of the Court’s ruling in Van Buren.

As we explained here and here, the Supreme Court held in Van Buren that an individual who has legitimate access to a computer network but accesses it for an improper or unauthorized purpose does not violate the CFAA. Prior to Van Buren, several Circuits held that terms of service violations could implicate the CFAA. In rejecting this broad interpretation of the CFAA, the Court in Van Buren noted that such an interpretation “would attach criminal penalties to a breathtaking amount of commonplace computer activity.” We predicted that the Van Buren Court’s holding would make it challenging to assert claims under the CFAA for terms of service violations, including for misuse of data or information contained on a company’s website that likely would have been deemed to have “exceed[ed] authorized access” under prior precedent.

The Ninth Circuit’s recent opinion confirmed our predictions. The principal issue that the Ninth Circuit addressed was whether hiQ’s continued scraping and use of LinkedIn member data following receipt of LinkedIn’s cease-and-desist letter constituted access “without authorization” under the CFAA. In other words, the court considered whether “without authorization” encompasses situations in which prior authorization is not generally required, but a person, or bot, is refused access. Unsurprisingly, the Ninth Circuit held that the Supreme Court’s decision in Van Buren reinforced its prior holding, relying on the Court’s “gates-up-or-down” inquiry (i.e., if authorization is required and has been given, the gates are up; if authorization is required and has not been given, the gates are down). According to the Ninth Circuit, where information is on a publicly available website, “that computer has erected no gates to lift or lower in the first place.” Based on its reasoning, the court in hiQ v. LinkedIn articulated the following rule for CFAA liability:

[T]he CFAA’s prohibition on accessing a computer “without authorization” is violated when a person circumvents a computer’s generally applicable rules regarding access permissions, such as username and password requirements, to gain access to a computer. It is likely that when a computer network generally permits public access to its data, a user’s accessing that publicly available data will not constitute access without authorization under the CFAA.

In other words, only information that requires some prior authorization is encompassed under the CFAA. Publicly available information is generally not.

What does this mean for the future of data scraping litigation? Companies that maintain publicly available information on their websites cannot rely on the CFAA to prohibit others from scraping that data, even if the companies subsequently revoke access to the information, or if data scraping is a violation of the websites’ terms of use. Companies must require prior authorization, such as a username and password, to access the data in the first instance in order for scraping of that data to be actionable under the CFAA.

That is not to say that companies that maintain publicly available information are without any remedy. As the Ninth Circuit emphasized, victims of data scraping may potentially assert a state common law claim for trespass to chattels. Moreover, a breach of contract claim may also be viable, depending on the relevant jurisdiction and on whether the website’s terms may be deemed a browsewrap or clickwrap agreement.

Stay tuned for how data scraping litigation will progress. CPW will keep you in the loop.

Meta, the parent company of Facebook, has sued Hong Kong-based Social Data Trading Ltd. for scraping data from millions of Instagram and Facebook profiles.  Meta alleges that after it blocked Social Data Trading’s access to Instagram and Facebook, the company continued to surreptitiously pull profile information from both websites.  Meta alleges that Social Data Trading violated the terms of service for both Instagram and Facebook and that, by circumventing Meta’s block on Social Data Trading’s use of those websites, the defendant also engaged in illegal hacking under Section 502 of California’s Penal Code.  Finally, Meta seeks recovery for unjust enrichment, in addition to its claims for breach of contract and hacking under Section 502.

The complaint does not contain any claims under 18 U.S.C. § 1030, the Computer Fraud and Abuse Act (“CFAA”).  This is an issue we are watching closely as, generally speaking, the CFAA no longer supports claims for data scraping that merely violates a website’s terms of service.  Presently before the Ninth Circuit, on remand from the Supreme Court, is the question of whether, once a website revokes permission and takes technical steps to block access, a subsequent act of scraping a public website constitutes a violation of the CFAA.  LinkedIn Corp. v. hiQ Labs, Inc., No. 19-1116, 141 S. Ct. 2752, 210 L. Ed. 2d 902, 2021 U.S. LEXIS 2997, 2021 WL 2405144, at *1 (U.S. June 14, 2021).  The Ninth Circuit heard oral argument on this case in October.  We anticipate a ruling soon.  As always, watch this space for an update.

In the aftermath of the Supreme Court’s Van Buren decision this month and its resulting impact on data privacy litigation, the Supreme Court ordered the hiQ/LinkedIn data scraping saga to be remanded to the Ninth Circuit.

Recall that in March 2020, LinkedIn filed a petition for a writ of certiorari, raising the issue of “[w]hether a company that deploys anonymous computer “bots” to circumvent technical barriers and harvest millions of individuals’ personal data from computer servers that host public-facing websites—even after the computer servers’ owner has expressly denied permission to access the data—“intentionally accesses a computer without authorization” in violation of the Computer Fraud and Abuse Act.” [Note: Of course, it is all about framing.  According to hiQ, the question was instead whether a professional networking website, such as LinkedIn, may rely on the CFAA’s prohibition on “intentionally access[ing] a computer without authorization” to prevent a competitor from accessing information that the website’s users have shared on their public profiles and that is available for viewing by anyone with a web browser.]

Well, on June 14, the Supreme Court issued a summary disposition in hiQ Labs, Inc. v. LinkedIn Corp. granting certiorari.  The Court vacated the Ninth Circuit’s previous judgment and remanded the case for additional consideration in light of the high court’s ruling in Van Buren.  This case is sure to be of interest going forward, as Van Buren’s impact continues to play out in the lower courts.  Stay tuned. CPW will be there to keep you in the loop.

As readers of CPW know, data scraping is a hot button data privacy issue.  We previously covered the hiQ/LinkedIn data-scraping saga HERE, and HERE.  In the most recent ruling out of the Northern District of California, Judge Chen denied hiQ’s motion to dismiss LinkedIn’s counterclaims for breach of contract, misappropriation, and trespass to chattels.  Additionally, the Court deferred ruling on the motion to dismiss counterclaims for violation of the Computer Fraud and Abuse Act (“CFAA”) and California Penal Code § 502, pending the Supreme Court’s ruling on LinkedIn’s petition for a writ of certiorari.

What question is pending before the SCOTUS in LinkedIn’s petition for writ?  As LinkedIn phrases it, the issue is “[w]hether a company that deploys anonymous computer ‘bots’ to circumvent technical barriers and harvest millions of individuals’ personal data from computer servers that host public-facing websites—even after the computer servers’ owner has expressly denied permission to access the data—‘intentionally accesses a computer without authorization’ in violation of the Computer Fraud and Abuse Act.”  [Note: In hiQ’s framing, the question is instead whether a professional networking website, such as LinkedIn, may rely on the CFAA’s prohibition on “intentionally access[ing] a computer without authorization” to prevent a competitor from accessing information that the website’s users have shared on their public profiles and that is available for viewing by anyone with a web browser.]

In addition to LinkedIn’s petition, the question of “when does a person exceed authorized access under the CFAA?” is also pending before SCOTUS in the case of United States v. Van Buren, 940 F.3d 1192 (11th Cir. 2019), cert. granted, 140 S. Ct. 2667 (2020), although it involves different facts than the present litigation.  According to Judge Chen, both decisions may “have an impact on the instant case.”  And “[t]he Court will be in a better position to address the counterclaim[s] once the Supreme Court has issued its decision in Van Buren and/or the instant case.”

Since the specific question pending before the SCOTUS relates to the meaning of “unauthorized access” under CFAA, it was not surprising that Judge Chen deferred the ruling on the CFAA claims until after the SCOTUS has issued its decision.  What was somewhat more surprising, or interesting, was the Court also deferring ruling under the California Penal Code § 502, pending the SCOTUS ruling.  The Court agreed that although § 502 is not on all fours with the CFAA, the question of whether “as a matter of policy, the use of public information should be deemed criminal conduct” was deemed related to the question of “unauthorized access” under CFAA.

For you novices out there, California Penal Code § 502 makes it unlawful to “knowingly” and “without permission” access, alter, damage, delete, destroy, or otherwise use any data, computer or computer system or network.  In contrast to the CFAA, § 502 does not require “unauthorized access” but rather “knowingly access,” “without permission.”  In other words, what makes the access unlawful is that the person “without permission” takes, copies, or makes use of the data.  Some may say § 502 is more restrictive than the CFAA, but regardless, there is no question that both of the currently unanswered questions are bound to have a significant impact in the data-scraping arena.

Regarding hiQ’s motion to dismiss LinkedIn’s counterclaims for breach of contract, misappropriation, and trespass to chattels, the Court considered those adequately pled, raising only factual disputes and questions, which are not meant to be addressed at the pleading stage.  At bottom, hiQ was not successful in its motion to dismiss, but to be fair, the true victory in this case is squarely dependent on the question pending before the SCOTUS.  Stay tuned for that.  CPW will be there.

CPW has previously covered the state of play for data scraping litigation in the context of hiQ’s and LinkedIn’s ongoing dispute.  For an update on this litigation, read on below.

As a reminder, data scraping is a mechanism of extracting data from websites (including websites not available to the public and accessible only to individuals with user accounts).  The practices of Clearview, which has been the subject of recent litigation, are a prime example.  Notwithstanding the clear privacy issues implicated by data scraping, there is no law specifically regulating this practice nationwide (although some state laws, as CPW has already covered, regulate the collection of biometric data).  As such, in litigation regarding data scraping, parties are stuck arguing over the application of various statutes that were enacted long before data scraping was prevalent.

Which brings us to the hiQ v. LinkedIn litigation.  In their recently filed Joint Case Management Statement, both parties have made it clear that they are not backing away from proving that their definition of “unauthorized access” under the Computer Fraud and Abuse Act (the “CFAA”) is what the legislators intended back in 1984 and what needs to prevail in a court of law. LinkedIn’s petition for certiorari on this question is still pending before the Supreme Court of the United States.  See our previous blog for more background on that, the CFAA, and this lawsuit.

As you will recall, hiQ filed its initial complaint against LinkedIn in 2017, alleging that LinkedIn’s cease-and-desist letters to hiQ, followed by LinkedIn restricting hiQ’s access to its website, were anticompetitive and violated state and federal laws.  The crux of hiQ’s complaint was that LinkedIn did not have monopoly rights to personal data made publicly available by its users, and that by scraping its website, hiQ did not violate users’ privacy rights (as LinkedIn alleges).

The Northern District of California granted hiQ’s request for a preliminary injunction barring LinkedIn from restricting hiQ’s access to publicly available LinkedIn member profiles.  LinkedIn appealed, but the appeal was denied. The Ninth Circuit determined that “scraping” publicly available information from LinkedIn likely is not a violation of the CFAA because the LinkedIn computers are publicly accessible.  As such, hiQ did not access the computers “without authorization” as required by the CFAA.  LinkedIn then filed its petition for certiorari to the SCOTUS, which is currently pending.

Separate from the preliminary injunction, following Judge Chen’s ruling on its motion to dismiss, LinkedIn filed its Answer and Counterclaims on November 20, 2020, including counterclaims under the CFAA.  On January 18, 2021, hiQ filed a motion to dismiss LinkedIn’s counterclaims under the CFAA.  LinkedIn’s opposition is currently due on March 4, 2021.  It is unlikely that the SCOTUS will respond on the pending certiorari before then.

In addition to the meaning of unauthorized access, the parties are also disputing:

  • Whether the California Penal Code § 502 is simply coextensive with the CFAA or has a broader scope, whether it applies to public profile pages on the LinkedIn website and, if it does apply, whether hiQ’s access to such profiles violates the statute;
  • Whether any of hiQ’s claims are preempted by the CFAA or California Penal Code § 502; and
  • Whether hiQ breached the LinkedIn User Agreement, which specifically prohibits automated access and scraping.

Factual disputes also exist, including the method that hiQ purportedly used to gain access to LinkedIn’s computers and scrape data of LinkedIn’s members (i.e., the use of automated software or “bots”), and whether hiQ “knowingly” bypassed LinkedIn’s technical measures.  LinkedIn is also disputing whether hiQ’s loss of employees and its difficulty in signing customers or raising money and investments (including all of the reasons why hiQ’s business has failed, if indeed it has) are in fact related to LinkedIn’s cease and desist letter or other actions.  LinkedIn has pointed to the vulnerability of “startups” in general as a likely cause of hiQ’s alleged business failure.

As we mentioned earlier, the questions posed in this lawsuit are bound to have a major impact in the data scraping arena.  Stay tuned.

 

Last week (9th July), the ICO announced that it would join forces with the Office of the Australian Information Commissioner (OAIC) to investigate the use of personal information, including biometric data, by Clearview AI, Inc. (Clearview). Limited information is available so far, but given the focus of the investigation, this is an important step in determining data protection rights and obligations, where information is ‘scraped’ from ‘publicly available’ sources, for the purposes of tackling crime.