Featured Posts

Targeted Copyright Enforcement vs. Inaccurate Enforcement

Let's continue our discussion about copyright enforcement against online infringers. I wrote last time about how targeted enforcement can deter many possible violators even if the enforcer can only punish a few violators. Clever targeting of enforcement can destroy the safety-in-numbers effect that might otherwise shelter a crowd of would-be violators.

In the online copyright context, the implication is that large copyright owners might be able to use lawsuit threats to deter a huge population of would-be infringers, even if they can only manage to sue a few infringers at a time. In my previous post, I floated some ideas for how they might do this.

Today I want to talk about the implications of this. Let's assume, for the sake of argument, that copyright owners have better deterrence strategies available -- strategies that can deter more users, more effectively, than they have managed so far. What would this imply for copyright policy?

The main implication, I think, is to cast doubt on the big copyright owners' current arguments in favor of broader, less accurate enforcement. These proposed enforcement strategies go by various names, such as "three strikes" and "graduated response". What defines them is that they reduce the cost of each enforcement action, while at the same time reducing the assurance that the party being punished is actually guilty.

Typically the main source of cost reduction is the elimination of due process for the accused. For example, "three strikes" policies typically cut off someone's Internet connection if they are accused of infringement three times -- the theory being that making three accusations is much cheaper than proving one.

There's a hidden assumption underlying the case for cheap, inaccurate enforcement: that the only way to deter infringement is to launch a huge number of enforcement actions, so that most of the would-be violators will expect to face enforcement. The main point of my previous post is that this assumption is not necessarily true -- that it's possible, at least in principle, to deter many people with a moderate number of enforcement actions.

Indeed, one of the benefits of an accurate enforcement strategy -- a strategy that enforces only against actual violators -- is that the better it works, the cheaper it gets. If there are few violators, then few enforcement actions will be needed. A high-compliance, low-enforcement equilibrium is the best outcome for everybody.

Cheap, inaccurate enforcement can't reach this happy state.

Let's say there are 100 million users, and you're using an enforcement strategy that punishes 50% of violators, and 1% of non-violators. If half of the people are violators, you'll punish 25 million violators, and you'll punish 500,000 non-violators. That might seem acceptable to you, if the punishments are small. (If you're disconnecting 500,000 people from modern communications technology, that would be a different story.)

But now suppose that user behavior shifts, so that only 1% of users are violating. Then you'll be punishing 500,000 violators (50% of the 1,000,000 violators) along with 990,000 non-violators (1% of the 99,000,000 non-violators). Most of the people you'll be punishing are innocent, which is clearly unacceptable.
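
For readers who want to check the arithmetic, here is a quick sketch of the calculation. The population size, catch rate, and false-positive rate are the hypothetical numbers from the example above, not real measurements:

```python
# Sketch: who gets punished under a cheap, inaccurate enforcement scheme.
# All numbers are the hypothetical ones from the text, not real data.

def punished(total_users, violator_share, catch_rate=0.50, false_positive_rate=0.01):
    violators = total_users * violator_share
    non_violators = total_users - violators
    punished_guilty = violators * catch_rate
    punished_innocent = non_violators * false_positive_rate
    innocent_share = punished_innocent / (punished_guilty + punished_innocent)
    return punished_guilty, punished_innocent, innocent_share

for share in (0.50, 0.01):
    guilty, innocent, frac = punished(100_000_000, share)
    print(f"{share:.0%} violating: punish {guilty:,.0f} violators and "
          f"{innocent:,.0f} non-violators ({frac:.0%} of the punished are innocent)")
```

At a 50% violation rate, about 2% of the people punished are innocent; at a 1% violation rate, roughly two-thirds are.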

Any cheap, inaccurate enforcement scheme will face this dilemma: it can be effective, or it can be fair, but it can't be both. The better it works, the more unfair it gets. It can never reach the high-compliance, low-enforcement equilibrium that should be the goal of every enforcement strategy.

Targeted Copyright Enforcement: Deterring Many Users with a Few Lawsuits

One reason the record industry's strategy of suing online infringers ran into trouble is that there are too many infringers to sue. If the industry can only sue a tiny fraction of infringers, then any individual infringer will know that he is very unlikely to be sued, and deterrence will fail.

Or so it might seem -- until you read The Dynamics of Deterrence, a recent paper by Mark Kleiman and Beau Kilmer that explains how to deter a great many violators despite limited enforcement capacity.

Consider the following hypothetical. There are 26 players, whom we'll name A through Z. Each player can choose whether or not to "cheat". Every player who cheats gets a dollar. There's also an enforcer. The enforcer knows exactly who cheated, and can punish one (and only one) cheater by taking $10 from him. We'll assume that players have no moral qualms about cheating -- they'll do whatever maximizes their expected profit.

This situation has two stable outcomes, one in which nobody cheats, and the other in which everybody cheats. The everybody-cheats outcome is stable because each player figures that he has only a 1/26 chance of facing enforcement, and a 1/26 chance of losing $10 -- an expected loss of about 38 cents -- is not enough to scare him away from the $1 he can get by cheating.

It might seem that deterrence doesn't work because the cheaters have safety in numbers. It might seem that deterrence can only succeed by raising the penalty to more than $26. But here comes Kleiman and Kilmer's clever trick.

The enforcer gets everyone together and says, "Listen up, A through Z. From now on, I'm going to punish the cheater who comes first in the alphabet." Now A will stop cheating, because he knows he'll face certain punishment if he cheats. B, knowing that A won't cheat, will then realize that if he cheats, he'll face certain punishment, so B will stop cheating. Now C, knowing that A and B won't cheat, will reason that he had better stop cheating too. And so on ... with the result that nobody will cheat.

Notice that the trick still works even if punishment is not certain. Suppose each cheater has an 80% chance of avoiding detection. Now A is still deterred, because even a 20% chance of being fined $10 outweighs the $1 benefit of cheating. And if A is deterred, then B is deterred for the same reason, and so on.
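
For the curious, here is a small sketch that mechanizes this reasoning. It starts from the everybody-cheats outcome and lets players best-respond in order under the announced rule; the $1 gain, $10 fine, and 20% detection probability are the hypothetical numbers from the example above.

```python
# Sketch of the Kleiman/Kilmer ordering trick from the example above.
# Assumptions (all hypothetical, from the text): 26 players, $1 gain from
# cheating, $10 fine, the enforcer punishes the earliest *detected* cheater
# in the announced order, and each cheater evades detection with
# probability 0.8 (i.e. is detected with probability 0.2).

GAIN, FINE, P_DETECT = 1.0, 10.0, 0.2
players = [chr(ord("A") + i) for i in range(26)]

def expected_fine(i, cheating):
    """Expected fine for player i if he cheats, given who else cheats.

    He pays only if he is detected and no earlier cheater is detected,
    because the enforcer punishes the earliest detected cheater."""
    p_no_earlier_detected = 1.0
    for j in range(i):
        if cheating[j]:
            p_no_earlier_detected *= (1 - P_DETECT)
    return FINE * P_DETECT * p_no_earlier_detected

# Start from the everybody-cheats outcome and let players best-respond
# in order, repeatedly, until nobody wants to change his choice.
cheating = [True] * 26
changed = True
while changed:
    changed = False
    for i in range(26):
        best = GAIN > expected_fine(i, cheating)   # cheat only if it pays
        if best != cheating[i]:
            cheating[i] = best
            changed = True

print("cheaters left:", [p for p, c in zip(players, cheating) if c])  # -> []
```

Starting from everybody-cheats, the pool of cheaters unravels from the front of the alphabet until nobody cheats, exactly as the informal argument predicts.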

Notice also that this trick might work even if some of the players don't think things through. Suppose A through J are all smart enough not to cheat, but K is clueless and cheats anyway. K will get punished. If he cheats again, he'll get punished again. K will learn quickly, by experience, that cheating doesn't pay. And once K learns not to cheat, the next clueless player will be exposed and will start learning not to cheat. Eventually, all of the clueless players will learn not to cheat.

Finally, notice that there's nothing special about using alphabetical order. The enforcer could use reverse alphabetical or any other order, and the same logic would apply. Any ordering will do, as long as each player knows where he is in the order.

Now let's apply this trick to copyright deterrence. Suppose the RIAA announces that from now on they're going to sue the violators who have the lowest U.S. IP addresses. Now users with low IP addresses will have a strong incentive to avoid infringing, which will give users with slightly higher IP addresses a stronger incentive to avoid infringing, and so on.

You might object that infringers aren't certain to get caught, or that infringers might be clueless or irrational, or that IP address order is arbitrary. But I explained above why these objections aren't necessarily showstoppers. Players might still be deterred even if detection is a probability rather than a certainty; clueless players might still learn by experience; and an arbitrary ordering can work perfectly well.

Alternatively, the industry could use time as an ordering, by announcing, for example, that starting at 8:00 PM Eastern time tomorrow evening, they will sue the first 1000 U.S. users they see infringing. This would make infringing at 8:00 PM much riskier than normal, which might keep some would-be infringers offline at that hour, which in turn would make infringing at 8:00 PM even riskier, and so on. The resulting media coverage ("I infringed at 8:02 and now I'm facing a lawsuit") could make the tactic even more effective next time.

(While IP address or time ordering might work, many other orderings are infeasible. For example, they can't use alphabetical ordering on the infringers' names, because they don't learn names until later in the process. The ideal ordering is one that can be applied very early in the investigative process, so that only cases at the beginning of the ordering need to be investigated. IP address and time ordering work well in this respect, because they are evident right away, both to the enforcer and to would-be infringers.)

I'm not claiming that this trick will definitely work. Indeed, it would be silly to claim that it could drive online infringement to zero. But there's a chance that it would deter more infringers, for longer, than the usual approach of seemingly random lawsuits has managed to do.

This approach has some interesting implications for copyright policy, as well. I'll discuss those next time.

New York AG Files Antitrust Suit Against Intel

Yesterday, New York's state Attorney General filed what could turn out to be a major antitrust suit against Intel. The suit accuses Intel of taking illegal steps to exclude a competitor, AMD, from the market.

All we have so far is the NYAG's complaint, which tells one side of the case. Intel will have ample opportunity to respond, and the NYAG will ultimately have the burden of backing up its allegations with proof -- so caution is in order at this point. Still, the complaint lays out the shape of the NYAG's case.

The case concerns the market for x86-compatible microprocessors, which are the "brains" of most personal computers. Intel dominates this market but a rival company, AMD, has long been trying to build market share. The complaint offers a long narrative of Intel's (and AMD's) relationships with major PC makers ("OEMs", in the jargon) such as Dell, HP, and IBM -- the customers who buy x86 processors from Intel and AMD.

The crux of the case is the allegation that Intel paid OEMs to not buy from AMD. This is reminiscent of one aspect of the big Microsoft antitrust case of 1998, in which one of the DOJ's claims was that Microsoft had paid people not to do business with Netscape.

I'll leave it to the experts to debate the economic niceties, but as I understand it there is a distinction between paying someone to buy more of your product (e.g. giving a volume discount) and paying someone to buy less of your rival's product. The former is generally fine, but if you have monopoly power the latter is suspect.

As the NYAG tells it, Intel tried to pretend the payments were for something else, but the participants knew what was really going on: that the payments would stop if an OEM started buying more from AMD. The evidence on this point could turn out to be important. Does the NYAG have "smoking gun" emails in which Intel made this explicit? Does the evidence show that OEMs understood the arrangement as the NYAG claims? I assume there's a huge trove of email evidence that both sides will be digesting.

It will be interesting to watch this case develop. Thanks to tools like RECAP, many of the case documents will be available to the public. Stay tuned for more improvements to RECAP that will provide even better access.

Election Day; More Unguarded Voting Machines

It's Election Day in New Jersey. As usual, I visited several polling places in Princeton over the last few days, looking for unguarded voting machines. It's been well demonstrated that a bad actor who can get physical access to a New Jersey voting machine can modify its behavior to steal votes, so an unguarded voting machine is a vulnerable voting machine.

This time I visited six polling places. What did I find?

The good news -- and there was a little -- is that in one of the six polling places, the machines were properly secured. I'm not sure where the machines were, but I know that they were not visible anywhere in the accessible areas of the building. Maybe the machines were locked in a storage room, or maybe they hadn't been delivered yet, but anyway they were probably safe. This is the first time I have ever found a local polling place, the night before the election, with properly secured voting machines.

At the other five polling places, things weren't so good. At three places, the machines were unguarded in an area open to the public. I walked right up to them and had private time with them. In two other places, the machines were visible from outside the building and protected only by an outside door with an easily defeated lock. I didn't defeat the locks myself -- I wasn't going to cross that line -- but I'll bet you could have opened them quickly with tools you probably have in your car.

The final scorecard: ten machines totally unprotected, eight machines poorly protected, two machines well-protected. That's an improvement, but then again any protection at all would have been an improvement. We still have a long way to go.

Sequoia Announces Voting System with Published Code

Sequoia Voting Systems, one of the major e-voting companies, announced Tuesday that it will publish all of the source code for its forthcoming Frontier product. This is great news -- an important step toward the kind of transparency that is necessary to make today's voting systems trustworthy.

To be clear, this will not be a fully open source system, because it won't give users the right to modify and redistribute the software. But it will be open in a very important sense, because everyone will be free to inspect, analyze, and discuss the code.

Significantly, the promise to publish code covers all of the systems involved in running the election and reporting results, "including precinct and central count digital optical scan tabulators, a robust election management and ballot preparation system, and tally, tabulation, and reporting applications". I'm sure the research community will be eager to study this code.

The trend toward publishing election system source code has been building over the last few years. Security experts have long argued that public scrutiny tends to increase security, and is one of the best ways to justify public trust in a system. Independent studies of major voting vendors' source code have found code quality to be disappointing at best, and vendors' all-out resistance to any disclosure has eroded confidence further. Add to this an increasing number of independent open-source voting systems, and secret voting technologies start to look less and less viable, as the public starts insisting that longstanding principles of election transparency be extended to election technology. In short, the time had come for this step.

Still, Sequoia deserves a lot of credit for being the first major vendor to open its technology. How long until the other major vendors follow suit?

DRM by any other name: The latest from Hollywood

Sunday's New York Times had an article, Studios' Quest for Life After DVDs. To nobody's surprise, consumers want to have convenient access to "their" media, wherever they happen to be, without all the annoying restrictions that come into play when you add DRM to the picture. To many people's surprise, sales of DVDs (much less Blu-ray) are in trouble.

In the third quarter, studios’ home entertainment divisions generated about $4 billion, down 3.2 percent from a year ago, according to the Digital Entertainment Group, a trade consortium. But digital distribution contributed just $420 million, an increase of 18 percent.

Given that DVDs are really a luxury good (versus, say, food or electricity), the 3.2 percent drop seems like Hollywood is getting off easy. The growth in digital distribution is clearly getting attention, though. What's going on here? I imagine several things. People sometimes miss their shows. Maybe the cable went out. Maybe the TiVo crashed. Maybe they're on the road. Drop $2 at the iTunes Store and you're good to go. That's attractive and it's real money.

Still, the article goes on to talk about... yet more DRM.

Standing in the way are technology hurdles — how to let consumers play a video on various devices without letting them share it with 10,000 close friends on a pirate site — and the reluctance of studios to cooperate too closely with rivals for reasons of antitrust scrutiny and sheer competitiveness.
...
And piracy, at least conceptually, would be less of a worry. The technology [Disney's Keychest] rests on cloud computing, in which huge troves of data are stored on remote servers so users have access from anywhere. Movies would be streamed from the cloud and never downloaded, making them harder to pirate.

Of course, this is baloney. If it's going to work on my iPhone while I'm sitting in an airplane, the entire video needs to be stored there in advance. Furthermore, if the video is supposed to be "high definition," that's a bare minimum of 5 megabits/sec. (Broadcast HD is 20 megabits/sec and Blu-ray is 48 megabits/sec.) Most home DSL or cable modem connections either will never go that fast, or certainly cannot maintain those speeds without hiccups, particularly when sharing the line with other users. To do high quality video, you either have to have a real broadcast medium (cable, over-the-air, or satellite) or you have to download in advance and store on a hard drive.
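
To put those bitrates in perspective, here is the back-of-the-envelope arithmetic for how much data a movie at each rate represents (the two-hour running time is just an illustrative assumption):

```python
# Back-of-the-envelope arithmetic for the bitrates quoted above:
# how much data a two-hour movie represents at each rate.
# (Two hours is an assumed, illustrative running time.)

RATES_MBPS = {"'HD' streaming floor": 5, "broadcast HD": 20, "Blu-ray": 48}
seconds = 2 * 60 * 60  # two hours

for label, mbps in RATES_MBPS.items():
    gigabytes = mbps * 1_000_000 * seconds / 8 / 1e9
    print(f"{label}: {mbps} Mbit/s for 2 hours = about {gigabytes:.1f} GB")
# 5 Mbit/s -> ~4.5 GB, 20 Mbit/s -> ~18 GB, 48 Mbit/s -> ~43 GB
```

In other words, the "never downloaded" story still means moving, or pre-positioning, gigabytes per film.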

And, of course, once you've stored the video, it's just not that hard to extract it. And it always will be. The challenge for Hollywood is to change the incentives of the game. Maybe sell me a flat-rate subscription. Maybe bundle it with my DSL provider. But make the experience compelling enough and cheap enough, and I'll do it. I regularly extract video from my TiVo and copy it to my iPhone via third-party software. It's practically painless and it happens to yield files that I could share with the world, but I don't. Why? Because there's real downside (I'd rather not get sued, thanks), and no particular upside.

So, dearest Hollywood executive, consider that selling your content for a reduced price, with no DRM, is not the same thing as "giving it away." If you allow third parties to license your content and distribute it without DRM, you can still go after the "pirates", yet you'll allow normal people to enjoy your work without making them suffer for it. Yes, you may have kids copying content from one to another, just as we used to do by dubbing cassette tapes, but those incremental losses can and will be offset by the incremental gains of people enjoying your work and hitting the "buy" button.

There’s anonymity on the Internet. Get over it.

In a recent interview, prominent antivirus developer Eugene Kaspersky decried the role of anonymity in cybercrime. This is not a new claim – it is touched on in the Commission on Cybersecurity for the 44th Presidency Report and the Cybersecurity Act of 2009, among others – but it misses the mark. Any Internet design would allow anonymity. What renders our Internet vulnerable is primarily weakness of software security and authentication, not anonymity.

Consider a hypothetical of three Internet users: Alice, Bob, and Charlie. If Alice wants to communicate anonymously with Charlie, she may relay her messages through Bob. While Charlie knows Bob is an intermediary, Charlie does not know with whom he is ultimately communicating. For even greater anonymity Alice can pass her messages through multiple Bobs, and by applying cryptography she can ensure no individual Bob can piece together that she is communicating with Charlie. This basic approach to anonymity is remarkable in its independence of the Internet’s design: it only requires that some Bob(s) can and do run intermediary software. Even on an Internet where users could verify each other’s identity this means of anonymity would remain viable.
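
To make the layering concrete, here is a toy sketch of the idea in Python. It is not a real anonymity protocol (systems like Tor also handle routing, key negotiation, and traffic analysis); it simply assumes Alice already shares a symmetric key with each Bob and uses the Fernet cipher from the cryptography package to show how each relay peels off exactly one layer.

```python
# Toy sketch of relaying through multiple "Bobs" with layered encryption.
# This illustrates the idea only; it assumes Alice already shares a
# symmetric key with each relay (real systems negotiate keys with
# public-key cryptography and also carry routing information per layer).
from cryptography.fernet import Fernet

relay_keys = [Fernet.generate_key() for _ in range(3)]   # Bob1, Bob2, Bob3

def wrap(message, route):
    """Alice adds one encryption layer per relay, innermost layer last in
    the route, so each relay can remove exactly one layer."""
    for key in reversed(route):
        message = Fernet(key).encrypt(message)
    return message

def relay(ciphertext, key):
    """Each Bob strips one layer; all it sees is the next opaque blob."""
    return Fernet(key).decrypt(ciphertext)

onion = wrap(b"hello Charlie, guess who", relay_keys)
for key in relay_keys:          # Bob1 -> Bob2 -> Bob3
    onion = relay(onion, key)
print(onion)                    # what Charlie finally receives
```

The first Bob knows the traffic came from Alice but can't read it; the last Bob can read what is delivered to Charlie but has no idea it came from Alice.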

The sad state of software security – the latest DHS weekly bulletin alone identified over 40 “high severity” vulnerabilities – is what enables malicious users to exploit the Internet’s indelible capacity for anonymity. Modifying the prior hypothetical, suppose Alice now wants to spam, phish, launch a denial-of-service (DoS) attack against, or hack Charlie. After compromising Bob’s computer with malicious software (malware), Alice can send emails, host websites, and launch DoS attacks from it; Charlie knows Bob is apparently misbehaving, but has no means of discovering Alice’s role. Nearly all spam, phishing, and DoS attacks are now perpetrated with networks of compromised computers like Bob’s (botnets). As of the writing of a July 2009 private-sector report, just five botnets were the source of nearly 75% of spam. Worse yet, botnets are increasingly self-perpetuating: spam and phishing websites propagate malware that compromises new computers for the botnet.

Shortcomings in authentication, the means of proving one’s identity either when necessary or at all times, are a secondary contributor to the Internet’s ills. Most applications rely on passwords, which are easily guessed or divulged through deception – the very mechanisms of most phishing and account hijacking. There are potential technical solutions that would enable a user to authenticate themselves without the risk of compromising accounts. But any approach will be undermined by weaknesses in underlying software security when a malicious party can trivially compromise a user’s computer.
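
One such family of solutions is challenge-response authentication with public-key signatures, in which the server stores only a public key and the user's secret never crosses the network. Here is a minimal sketch using Ed25519 from the Python cryptography package; it is illustrative only, and real deployments add replay protection, secure key storage, and account recovery.

```python
# Minimal sketch of challenge-response authentication: the server stores
# only a public key, and the user's secret never travels over the network,
# so there is no password to phish or divulge.
import os
from cryptography.hazmat.primitives.asymmetric import ed25519

# Enrollment: the user's device generates a key pair and registers the
# public half with the server.
private_key = ed25519.Ed25519PrivateKey.generate()
server_stored_public_key = private_key.public_key()

# Login: the server issues a fresh random challenge ...
challenge = os.urandom(32)

# ... the device signs it with the private key it never reveals ...
signature = private_key.sign(challenge)

# ... and the server verifies the signature against the stored public key.
server_stored_public_key.verify(signature, challenge)   # raises if invalid
print("authenticated")
```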

The policy community is already trending towards acceptance of Internet anonymity and refocusing on software security and authentication; the recent White House Cyberspace Policy Review in particular emphasizes both issues. To the remaining unpersuaded, I can only offer at last a truism: There’s anonymity on the Internet. Get over it.

Net Neutrality: When is Network Management "Reasonable"?

Last week the FCC released its much-awaited Notice of Proposed Rulemaking (NPRM) on network neutrality. As expected, the NPRM affirms past FCC neutrality principles, and adds two more. Here's the key language:

1. Subject to reasonable network management, a provider of broadband Internet access service may not prevent any of its users from sending or receiving the lawful content of the user's choice over the Internet.

2. Subject to reasonable network management, a provider of broadband Internet access service may not prevent any of its users from running the lawful applications or using the lawful services of the user's choice.

3. Subject to reasonable network management, a provider of broadband Internet access service may not prevent any of its users from connecting to and using on its network the user's choice of lawful devices that do not harm the network.

4. Subject to reasonable network management, a provider of broadband Internet access service may not deprive any of its users of the user's entitlement to competition among network providers, application providers, service providers, and content providers.

5. Subject to reasonable network management, a provider of broadband Internet access service must treat lawful content, applications, and services in a nondiscriminatory manner.

6. Subject to reasonable network management, a provider of broadband Internet access service must disclose such information concerning network management and other practices as is reasonably required for users and content, application, and service providers to enjoy the protections specified in this part.

That's a lot of policy packed into (relatively) few words. I expect that my colleagues and I will have a lot to say about these seemingly simple rules over the coming weeks.

Today I want to focus on the all-purpose exception for "reasonable network management". Unpacking this term might tell us a lot about how the proposed rule would operate.

Here's what the NPRM says:

Reasonable network management consists of: (a) reasonable practices employed by a provider of broadband Internet access to (i) reduce or mitigate the effects of congestion on its network or to address quality-of-service concerns; (ii) address traffic that is unwanted by users or harmful; (iii) prevent the transfer of unlawful content; or (iv) prevent the unlawful transfer of content; and (b) other reasonable network management practices.

The key word is "reasonable", and in that respect the definition is nearly circular: in order to be "reasonable", a network management practice must be (a) "reasonable" and directed toward certain specific ends, or (b) "reasonable".

In the FCC's defense, it does seek comments and suggestions on what the definition should be, and it does say that it intends to make case-by-case determinations in practice, as it did in the Comcast matter. Further, it rejects a "strict scrutiny" standard of the sort that David Robinson rightly criticized in a previous post.

"Reasonable" is hard to define because in real life every "network management" measure will have tradeoffs. For example, a measure intended to block copyright-infringing material would in practice make errors in both directions: it would block X% (less than 100%) of infringing material, while as a side-effect also blocking Y% (more than 0%) of non-infringing material. For what values of X and Y is such a measure "reasonable"? We don't know.

Of course, declaring a vague standard rather than a bright-line rule can sometimes be good policy, especially where the facts on the ground are changing rapidly and it's hard to predict what kind of details might turn out to be important in a dispute. Still, by choosing a case-by-case approach, the FCC is leaving us mostly in the dark about where it will draw the line between "reasonable" and "unreasonable".

Intractability of Financial Derivatives

A new result by Princeton computer scientists and economists shows a striking application of computer science theory to the field of financial derivative design. The paper is Computational Complexity and Information Asymmetry in Financial Products by Sanjeev Arora, Boaz Barak, Markus Brunnermeier, and Rong Ge. Although computation has long been used in the financial industry for program trading and "the thermodynamics of money", this new paper applies an entirely different kind of computer science: Intractability Theory.

A financial derivative is a contract specifying a payoff calculated by some formula based on the yields or prices of a specific collection of underlying assets. Consider the securitization of debt: a CDO (collateralized debt obligation) is a security formed by packaging together hundreds of home mortgages. The CDO is supposedly safer than the individual mortgages, since it spreads the risk (not every mortgage is supposed to default at once). Furthermore, a CDO is usually divided into "senior tranches", which are guaranteed not to drop in value as long as the total number of defaults in the pool does not exceed some threshold, and "junior tranches" that are supposed to bear all the risk.
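
In code, the tranche structure looks roughly like this. The threshold and pool size here are illustrative toy numbers, not terms from any real CDO:

```python
# Toy payoff rule for the tranche structure described above (illustrative
# threshold and pool size only).
def tranche_losses(defaults, pool_size=100, senior_threshold=0.10):
    """Losses hit the junior tranche first; the senior tranche is only
    impaired once defaults exceed the threshold share of the pool."""
    loss = defaults / pool_size
    junior_loss = min(loss, senior_threshold)
    senior_loss = max(0.0, loss - senior_threshold)
    return {"junior loss": junior_loss, "senior loss": senior_loss}

print(tranche_losses(defaults=5))    # junior absorbs it; senior untouched
print(tranche_losses(defaults=25))   # threshold breached; senior impaired
```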

Trading in derivatives brought down Lehman Brothers, AIG, and many other buyers, based on mistaken assumptions about the independence of the underlying asset prices; they underestimated the danger that many mortgages would all default at the same time. But the new paper shows that in addition to that kind of danger, risks can arise because a seller can deliberately construct a derivative with a booby trap hiding in plain sight.

It's like encryption: it's easy to construct an encrypted message (your browser does this all the time), but it's hard to decrypt without knowing the key (we believe even the NSA doesn't have the computational power to do it). Similarly, the new result shows that the seller can construct the CDO with a booby trap, but even Goldman Sachs won't have enough computational power to analyze whether a trap is present.

The paper shows the example of a high-volume seller who builds 1000 CDOs from 1000 asset classes of home mortgages. Suppose the seller knows that a few of those asset classes are "lemons" that won't pay off. The seller is supposed to randomly distribute the asset classes into the CDOs; this minimizes the risk for the buyer, because there's only a small chance that any one CDO has more than a few lemons. But the seller can "tamper" with the CDOs by putting most of the lemons in just a few of the CDOs. This has an enormous effect on the senior tranches of those tampered CDOs.
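
Here is a rough sketch of the tampering step with toy parameters of my own choosing (not the paper's construction): 1000 asset classes, 20 of which are lemons, and CDOs that each reference 100 asset classes.

```python
# Rough sketch of lemon concentration with toy parameters, not the
# paper's actual construction.
import random
random.seed(0)

N_ASSETS, N_CDOS, PER_CDO, N_LEMONS, N_TARGETS = 1000, 1000, 100, 20, 10
lemons = set(range(N_LEMONS))

def honest_cdo():
    return set(random.sample(range(N_ASSETS), PER_CDO))

def tampered_cdo():
    # A booby-trapped CDO: stuff in all the lemons, fill the rest honestly.
    rest = random.sample([a for a in range(N_ASSETS) if a not in lemons],
                         PER_CDO - N_LEMONS)
    return lemons | set(rest)

honest_pool = [honest_cdo() for _ in range(N_CDOS)]
tampered_pool = ([tampered_cdo() for _ in range(N_TARGETS)] +
                 [honest_cdo() for _ in range(N_CDOS - N_TARGETS)])

def worst(cdos):
    return max(len(cdo & lemons) for cdo in cdos)

print("worst lemon count, honest pool:  ", worst(honest_pool))    # single digits
print("worst lemon count, tampered pool:", worst(tampered_pool))  # 20 by construction
```

The tampered pool contains exactly the same lemons overall, but ten booby-trapped CDOs each carry all twenty of them, which is precisely the kind of concentration that wipes out their senior tranches.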

In principle, an alert buyer can detect tampering even if he doesn't know which asset classes are the lemons: he simply examines all 1000 CDOs and looks for a suspicious overrepresentation of some of the asset classes in some of the CDOs. What Arora et al. show is that this is an NP-complete problem ("densest subgraph"). This problem is believed to be computationally intractable; thus, even the most alert buyer won't have enough computational power to do the analysis.
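
To get a feel for why the obvious check is hopeless, consider the naive exhaustive search: score every small group of CDOs against every small group of asset classes and look for an unusually dense overlap. A quick count of the candidates (a toy sketch; the paper's formal argument is the densest-subgraph reduction mentioned above) shows the blow-up:

```python
# Counting the work a naive "look for overrepresentation" search would do:
# it must score every (group of CDOs, group of asset classes) pair.
from math import comb

def exhaustive_checks(n_cdos, n_assets, group_cdos, group_assets):
    """Number of candidate (CDO group, asset group) pairs to examine."""
    return comb(n_cdos, group_cdos) * comb(n_assets, group_assets)

print(exhaustive_checks(20, 20, 5, 5))                   # ~2.4e8: feasible
print(f"{exhaustive_checks(1000, 1000, 10, 20):.1e}")    # ~9e64: hopeless
```

Even generous guesses about the size of the hidden groups leave vastly more candidates than any buyer could ever examine.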

Arora et al. show it's even worse than that: even after the buyer has lost a lot of money (because enough mortgages defaulted to devalue his "senior tranche"), he can't prove that tampering occurred: he can't prove that the distribution of lemons wasn't random. This makes it hard to get recourse in court; it also makes it hard to regulate CDOs.

Intractability Theory forms the basis for several of the technologies discussed on Freedom-to-Tinker: cryptography, digital-rights management, watermarking, and others. Perhaps financial policy is now another one.

Sidekick Users' Data Lost: Blame the Cloud?

Users of Sidekick mobile phones saw much of their data disappear last week due to engineering problems at a Microsoft data center. Sidekick devices lose the contents of their memory when they don't have power (e.g. when the battery is being changed), so all data is transmitted to a data center for permanent storage -- which turned out not to be so permanent.

(The latest news is that some of the data, perhaps most of it, may turn out to be recoverable.)

A common response to this story is that this kind of danger is inherent in "cloud" computing services, where you rely on some service provider to take care of your data. But this misses the point, I think. Preserving data is difficult, and individual users tend to do a mediocre job of it. Admit it: You have lost your own data at some point. I know I have lost some of mine. A big, professionally run data center is much less likely to lose your data than you are.

It's worth noting, too, that many cloud services face lower risk of this sort of problem. My email, for example, lives in the cloud--the "official copy" is on a central server, and copies are downloaded frequently to my desktop and laptop computers. If the server were to go up in flames, along with all of the server backups, I would still be in good shape, because I would still have copies of everything on my desktop and laptop.

For my email and similar services, the biggest risk to data integrity is not that the server will disappear altogether, but that the server will misbehave in subtle ways, causing my stored data to be corrupted over time. Thanks to the automatic synchronization between the server and my two clients (desktop and laptop), bad data could be replicated silently into all copies. In principle, some of the damage could be repaired later, using the server's backups, but that's a best case scenario.

This risk, of buggy software corrupting data, has always been with us. The question is not whether problems will happen in the cloud -- in any complex technology, trouble comes with the territory -- but whether the cloud makes a problem worse.
