Bio


Benjamin Van Roy is broadly interested in the formulation and analysis of mathematical models that address problems in information technology, business, and public policy. He is an Associate Professor of Management Science and Engineering, Electrical Engineering, and, by courtesy, Computer Science, at Stanford University. He has held visiting positions as the Wolfgang and Helga Gaul Visiting Professor at the University of Karlsruhe and as the Chin Sophonpanich Foundation Professor of Banking and Finance at Chulalongkorn University. He has been involved with several technology companies as a researcher, advisor, founder, or director.

Academic Appointments


Professional Education


  • BS, Massachusetts Institute of Technology, Computer Science and Engineering (1993)
  • MS, Massachusetts Institute of Technology, Electrical Engineering and Computer Science (1995)
  • PhD, Massachusetts Institute of Technology, Electrical Engineering and Computer Science (1998)

2015-16 Courses


All Publications


  • Learning to Optimize via Posterior Sampling MATHEMATICS OF OPERATIONS RESEARCH Russo, D., Van Roy, B. 2014; 39 (4): 1221-1243
  • Directed Principal Component Analysis OPERATIONS RESEARCH Kao, Y., Van Roy, B. 2014; 62 (4): 957-972
  • Learning to Optimize Via Posterior Sampling. Russo, D., Roy, B., Van
  • Reputation Markets Yan, X., Van Roy, B.
  • Directed Principal Component Analysis. Kao, Y., H., Roy, B., Van
  • Adaptive Execution: Exploration and Learning of Price Impact. Park, B., Roy, B., Van
  • Learning a factor model via regularized PCA MACHINE LEARNING Kao, Y., Van Roy, B. 2013; 91 (3): 279-303
  • Eluder Dimension and the Sample Complexity of Optimistic Exploration to appear in NIPS. Russo, D., Roy, B., Van 2013
  • Efficient Exploration and Value Function Generalization in Deterministic Systems abridged version to appear in NIPS. Wen, Z., Roy, B., Van 2013
  • (More) Efficient Reinforcement Learning via Posterior Sampling to appear in NIPS. Osband, I., Russo, D., Roy, B., Van 2013
  • Strategic execution in the presence of an uninformed arbitrageur JOURNAL OF FINANCIAL MARKETS Moallemi, C. C., Park, B., Van Roy, B. 2012; 15 (4): 361-391
  • Intermediated Blind Portfolio Auctions MANAGEMENT SCIENCE Padilla, M., Van Roy, B. 2012; 58 (9): 1747-1760
  • Portfolio selection with qualitative input JOURNAL OF BANKING & FINANCE Chiarawongse, A., Kiatsupaibul, S., Tirapat, S., Van Roy, B. 2012; 36 (2): 489-496
  • Efficient Reinforcement Learning for High Dimensional Linear Systems Advances in Neural Information Processing Systems 25 Ibrahimi, M., Javanmard, A., Roy, B., Van MIT Press. 2012
  • Approximate Dynamic Programming for Optimizing Oil Production Chapter 25 in Reinforcement Learning and Approximate Dynamic Programming for Feedback Control Wen, Z., Durlofsky, L., J., Roy, B., Van, Aziz, K. edited by Lewis, F., L., Liu, D. Wiley-IEEE Press. 2012
  • Industry dynamics: Foundations for models with an infinite number of firms JOURNAL OF ECONOMIC THEORY Weintraub, G. Y., Benkard, C. L., Van Roy, B. 2011; 146 (5): 1965-1994
  • Resource Allocation via Message Passing INFORMS JOURNAL ON COMPUTING Moallemi, C. C., Van Roy, B. 2011; 23 (2): 205-219
  • Control of Diffusions via Linear Programming in Stochastic Programming: The State of the Art, in Honor of George B. Dantzig Han, J., Van Roy, B. Springer. 2011: 329-354
  • Use of Approximate Dynamic Programming for Production Optimization Wen, Z., Durlofsky, L., J., Roy, B., Van, Aziz, K. 2011
  • Manipulation Robustness of Collaborative Filtering MANAGEMENT SCIENCE Van Roy, B., Yan, X. 2010; 56 (11): 1911-1929
  • Investment and Market Structure in Industries with Congestion OPERATIONS RESEARCH Johari, R., Weintraub, G. Y., Van Roy, B. 2010; 58 (5): 1303-1317
  • On Regression-Based Stopping Times DISCRETE EVENT DYNAMIC SYSTEMS-THEORY AND APPLICATIONS Van Roy, B. 2010; 20 (3): 307-324
  • Computational Methods for Oblivious Equilibrium OPERATIONS RESEARCH Weintraub, G. Y., Benkard, C. L., Van Roy, B. 2010; 58 (4): 1247-1265
  • Universal Reinforcement Learning IEEE TRANSACTIONS ON INFORMATION THEORY Farias, V. F., Moallemi, C. C., Van Roy, B., Weissman, T. 2010; 56 (5): 2441-2454
  • Convergence of Min-Sum Message-Passing for Convex Optimization IEEE TRANSACTIONS ON INFORMATION THEORY Moallemi, C. C., Van Roy, B. 2010; 56 (4): 2041-2050
  • Dynamic Pricing with a Prior on Market Response OPERATIONS RESEARCH Farias, V. F., Van Roy, B. 2010; 58 (1): 16-29
  • Convergence of the Min-Sum Algorithm for Convex Optimization IEEE Transactions on Information Theory Moallemi, C., C., Roy, B., Van 2010; 56 (4): 2041-2050
  • Convergence of Min-Sum Message Passing for Quadratic Optimization IEEE TRANSACTIONS ON INFORMATION THEORY Moallemi, C. C., Van Roy, B. 2009; 55 (5): 2413-2423
  • Manipulation-Resistant Collaborative Filtering Systems Roy, B., Van, Yan, X. 2009
  • Directed Regression Advances in Neural Information Processing Systems 22 Kao, Y., H., Van Roy, B. MIT Press. 2009: 889-897
  • Markov Perfect Industry Dynamics With Many Firms ECONOMETRICA Weintraub, G. Y., Benkard, C. L., Van Roy, B. 2008; 76 (6): 1375-1411

    View details for DOI 10.3982/ECTA6158

    View details for Web of Science ID 000261055400005

  • Capacity of the trapdoor channel with feedback Permuter, H., Cuff, P., Van Roy, B., Weissman, T. IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC. 2008: 3150-3165
  • Approximate and Data-Driven Dynamic Programming for Queueing Networks Moallemi, C., C., Kumar, S., Van Roy, B. 2008
  • A short proof of optimality for the MIN cache replacement algorithm INFORMATION PROCESSING LETTERS Van Roy, B. 2007; 102 (2-3): 72-73
  • Managing the quality of a resource with stock and flow controls JOURNAL OF PUBLIC ECONOMICS Keohane, N., Van Roy, B., Zeckhauser, R. 2007; 91 (3-4): 541-569
  • Capacity and zero-error capacity of the chemical channel with feedback 2007 IEEE INTERNATIONAL SYMPOSIUM ON INFORMATION THEORY PROCEEDINGS, VOLS 1-7 Permuter, H., Cuff, P., Van Roy, B., Weissman, T. 2007: 1866-1870
  • An Approximate Dynamic Programming Approach to Network Revenue Management Farias, V., F., Van Roy, B. 2007
  • Consensus propagation Moallemi, C. C., Van Roy, B. IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC. 2006: 4753-4766
  • A cost-shaping linear program for average-cost approximate dynamic programming with performance guarantees MATHEMATICS OF OPERATIONS RESEARCH de Farias, D. P., Van Roy, B. 2006; 31 (3): 597-620
  • Performance loss bounds for approximate value iteration with state aggregation MATHEMATICS OF OPERATIONS RESEARCH Van Roy, B. 2006; 31 (2): 234-244
  • A generalized Kalman filter for fixed point approximation and efficient temporal-difference learning DISCRETE EVENT DYNAMIC SYSTEMS-THEORY AND APPLICATIONS Choi, D., Van Roy, B. 2006; 16 (2): 207-239
  • Approximation algorithms for dynamic resource allocation OPERATIONS RESEARCH LETTERS Farias, V. F., Van Roy, B. 2006; 34 (2): 180-190
  • A nonparametric approach to multiproduct pricing OPERATIONS RESEARCH Rusmevichientong, P., Van Roy, B., Glynn, P. W. 2006; 54 (1): 82-98
  • Tetris: A Study of Randomized Constraint Sampling in Probabilistic and Randomized Methods for Design Under Uncertainty Farias, V., F., Van Roy, B. edited by Calafiore, G., Dabbene, F. Springer-Verlag. 2006
  • Opportunities and Challenges in Using Online Preference Data for Vehicle Pricing: A Case Study at General Motors Journal of Revenue and Pricing Management Rusmevichientong, P., Salisbury, J., A., Truss, L., T., Roy, B., Van, Glynn, P., W. 2006; 5 (1): 45-61
  • TD(0) Leads to Better Policies than Approximate Value Iteration Advances in Neural Information Processing Systems 18 Roy, B., Van MIT Press. 2006
  • Consensus Propagation Advances in Neural Information Processing Systems 18 Moallemi, C., C., Van Roy, B. MIT Press. 2006
  • Oblivious Equilibrium: A Mean Field Approximation for Large Scale Dynamic Games Advances in Neural Information Processing Systems 18 Weintraub, G., Y., Benkard, C., L., Roy, B., Van 2006
  • A Non-Parametric Approach to Multi-Product Pricing Operations Research Rusmevichientong, P., Van Roy, B., Glynn, P., W. 2006; 54 (1): 82-98
  • A Generalized Kalman Filter for Fixed Point Approximation and Efficient Temporal-Difference Learning Discrete Event Dynamic Systems Choi, D., S., Van Roy, B. 2006; 16 (2)
  • An approximate dynamic programming approach to decentralized control of stochastic systems CONTROL OF UNCERTAIN SYSTEMS: MODELLING, APPROXIMATION, AND DESIGN Cogill, R., Rotkowitz, M., Van Roy, B., Lall, S. 2006; 329: 243-256
  • A universal scheme for learning 2005 IEEE INTERNATIONAL SYMPOSIUM ON INFORMATION THEORY (ISIT), VOLS 1 AND 2 Farias, V. F., Moallemi, C. C., Van Roy, B., Weissman, T. 2005: 1158-1162
  • Solitaire: Man Versus Machine Advances in Neural Information Processing Systems 17 Yan, X., Diaconis, P., Rusmevichientong, P., Roy, B., Van MIT Press. 2005
  • A Linear Program for Bellman Error Minimization with Performance Guarantees Advances in Neural Information Processing Systems 17 de Farias, D., P., Van Roy, B. MIT Press. 2005
  • On constraint sampling in the linear programming approach to approximate dynamic programming MATHEMATICS OF OPERATIONS RESEARCH de Farias, D. P., Van Roy, B. 2004; 29 (3): 462-478
  • Making eigenvector-based reputation systems robust to collusion ALGORITHMS AND MODELS FOR THE WEB-GRAPHS, PROCEEDINGS Zhang, H., Goel, A., Govindan, R., Mason, K., Van Roy, B. 2004; 3243: 92-104
  • Approximate Dynamic Programming for High-Dimensional Dynamic Resource Allocation Problems in Handbook of Learning and Approximate Dynamic Programming Powell, W., B., Van Roy, B. edited by Si, J., Barto, A., G., Powell, W., B. Wiley-IEEE Press, Hoboken, NJ. 2004: 261-279
  • An Approximate Dynamic Programming Approach to Decentralized Control of Stochastic Systems Cogill, R., Rotkowitz, M., Roy, B., Van, Lall, S. 2004
  • Distributed optimization in adaptive networks ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 16 Moallemi, C. C., Van Roy, B. 2004; 16: 887-894
  • The linear programming approach to approximate dynamic programming OPERATIONS RESEARCH de Farias, D. P., Van Roy, B. 2003; 51 (6): 850-865
  • Decentralized decision-making in a large team with local information GAMES AND ECONOMIC BEHAVIOR Rusmevichientong, P., Van Roy, B. 2003; 43 (2): 266-295
  • On constraint sampling in the linear programming approach to approximate linear programming 42ND IEEE CONFERENCE ON DECISION AND CONTROL, VOLS 1-6, PROCEEDINGS de Farias, D. P., Van Roy, B. 2003: 2441-2446
  • Decentralized Protocols for Optimization of Sensor Networks Moallemi, C., C., Van Roy, B. 2003
  • Approximate Linear Programming for Average-Cost Dynamic Programming Advances in Neural Information Processing Systems 15 de Farias, D., P., Van Roy, B. MIT Press. 2003
  • Improving Eigenvector-Based Reputation Systems Against Collusion Workshop on Algorithms and Models for the Web Graph Zhang, H., Goel, A., Govindan, R., Mason, K., Roy, B., Van 2003
  • Book Review: Self-Learning Control of Finite Markov Chains, by A. S. Poznyak, K. Najim, and E. Gomez-Ramirez Automatica Roy, B., Van 2003; 39 (2): 373-376
  • On average versus discounted reward temporal-difference learning MACHINE LEARNING Tsitsiklis, J. N., Van Roy, B. 2002; 49 (2-3): 179-191
  • Approximate dynamic programming via linear programming ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 14, VOLS 1 AND 2 de Farias, D. P., Van Roy, B. 2002; 14: 689-695
  • Algorithms for GPS Operation Indoors and Downtown GPS Solutions Agarwal, N., Basch, J., Beckmann, P., Bharti, P., Bloebaum, S., Casadei, S., Van Roy, B. 2002; 6 (3): 149-160
  • Regression methods for pricing complex American-Style options IEEE TRANSACTIONS ON NEURAL NETWORKS Tsitsiklis, J. N., Van Roy, B. 2001; 12 (4): 694-703

    Abstract

    We introduce and analyze a simulation-based approximate dynamic programming method for pricing complex American-style options, with a possibly high-dimensional underlying state space. We work within a finitely parameterized family of approximate value functions, and introduce a variant of value iteration, adapted to this parametric setting. We also introduce a related method which uses a single (parameterized) value function, which is a function of the time-state pair, as opposed to using a separate (independently parameterized) value function for each time. Our methods involve the evaluation of value functions at a finite set, consisting of "representative" elements of the state space. We show that with an arbitrary choice of this set, the approximation error can grow exponentially with the time horizon (time to expiration). On the other hand, if representative states are chosen by simulating the state process using the underlying risk-neutral probability distribution, then the approximation error remains bounded.

    View details for Web of Science ID 000169875300005

    View details for PubMedID 18249905

  • An Analysis of Belief Propagation on the Turbo Decoding Graph with Gaussian Densities IEEE Transactions on Information Theory Rusmevichientong, P., Van Roy, B. 2001; 47 (2): 745-765
  • Neuro-Dynamic Programming: Overview and Recent Trends in Handbook of Markov Decision Processes: Methods and Applications Van Roy, B. edited by Feinberg, E., Shwartz, A. Kluwer. 2001
  • A Generalized Kalman Filter for Fixed Point Approximation and Efficient Temporal-Difference Learning Choi, D., S., Van Roy, B. 2001
  • A Tractable POMDP for a Class of Sequencing Problems Rusmevichientong, P., Van Roy, B. 2001
  • On the existence of fixed points for approximate value iteration and temporal-difference learning JOURNAL OF OPTIMIZATION THEORY AND APPLICATIONS de Farias, D. P., Van Roy, B. 2000; 105 (3): 589-608
  • The optimal harvesting of environmental bads PROCEEDINGS OF THE 39TH IEEE CONFERENCE ON DECISION AND CONTROL, VOLS 1-5 Keohane, N., Van Roy, B., Zeckhauser, R. 2000: 234-239
  • Fixed Points for Approximate Value Iteration and Temporal-Difference Learning de Farias, D., P., Van Roy, B. 2000
  • Approximate value iteration with randomized policies PROCEEDINGS OF THE 39TH IEEE CONFERENCE ON DECISION AND CONTROL, VOLS 1-5 de Farias, D. P., Van Roy, B. 2000: 3421-3426
  • Temporal-difference learning and applications in finance COMPUTATIONAL FINANCE 1999 Van Roy, B. 2000: 447-461
  • Approximate value iteration and temporal-difference learning IEEE 2000 ADAPTIVE SYSTEMS FOR SIGNAL PROCESSING, COMMUNICATIONS, AND CONTROL SYMPOSIUM - PROCEEDINGS de Farias, D. P., Van Roy, B. 2000: 48-51
  • An analysis of turbo decoding with Gaussian densities ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 12 Rusmevichientong, P., Van Roy, B. 2000; 12: 575-581
  • Average cost temporal-difference learning AUTOMATICA Tsitsiklis, J. N., Van Roy, B. 1999; 35 (11): 1799-1808
  • Optimal stopping of Markov processes: Hilbert space theory, approximation algorithms, and an application to pricing high-dimensional financial derivatives IEEE TRANSACTIONS ON AUTOMATIC CONTROL Tsitsiklis, J. N., Van Roy, B. 1999; 44 (10): 1840-1851
  • Optimal Stopping of Markov Processes: Hilbert Space Theory, Approximation Algorithms, and an Application to Pricing High-Dimensional Financial Derivatives IEEE Transactions on Automatic Control Tsitsiklis, J., N., Van Roy, B. 1999; 44 (10): 1840-1851
  • An analysis of temporal-difference learning with function approximation IEEE TRANSACTIONS ON AUTOMATIC CONTROL Tsitsiklis, J. N., VANROY, B. 1997; 42 (5): 674-690
  • Neuro-dynamic programming overview and a case study in optimal stopping PROCEEDINGS OF THE 36TH IEEE CONFERENCE ON DECISION AND CONTROL, VOLS 1-5 Tsitsiklis, J. N., Van Roy, B. 1997: 1181-1186
  • A neuro-dynamic programming approach to retailer inventory management PROCEEDINGS OF THE 36TH IEEE CONFERENCE ON DECISION AND CONTROL, VOLS 1-5 Van Roy, B., Bertsekas, D. P., Lee, Y., Tsitsiklis, J. N. 1997: 4052-4057
  • Overview of Neuro-Dynamic Programming and a Case Study in Optimal Stopping Tsitsiklis, J., N., Van Roy, B. 1997
  • A Neuro-Dynamic Programming Approach to Retailer Inventory Management Van Roy, B., Bertsekas, D., P., Lee, Y., Tsitsiklis, J., N. 1997
  • Solving Data Mining Problems Through Pattern Recognition Prentice-Hall Kennedy, R., Lee, Y., Roy, B., Van, Reed, C., Lippman, R. 1997
  • Analysis of temporal-difference learning with function approximation Tsitsiklis, J. N., VANROY, B. M I T PRESS. 1997: 1075-1081
  • Average cost temporal-difference learning PROCEEDINGS OF THE 36TH IEEE CONFERENCE ON DECISION AND CONTROL, VOLS 1-5 Tsitsiklis, J. N., Van Roy, B. 1997: 498-502
  • Approximate solutions to optimal stopping problems Tsitsiklis, J. N., VANROY, B. M I T PRESS. 1997: 1082-1088
  • Feature-based methods for large scale dynamic programming MACHINE LEARNING Tsitsiklis, J. N., VANROY, B. 1996; 22 (1-3): 59-94
  • Stable linear approximations to dynamic programming for stochastic control problems with local transitions VANROY, B., Tsitsiklis, J. N. M I T PRESS. 1996: 1045-1051
  • Feature-based methods for large scale dynamic programming Tsitsiklis, J. N., VANROY, B. IEEE. 1995: 565-567
  • Solving Pattern Recognition Problems Unica Kennedy, R., Lee, Y., Reed, C., Van Roy, B. 1995