{"id":"https://api.bcdiploma.com/badges/assert/0747899BF0F338B0DF00B6764FBFE88C6D64E48A7B390CB75EF2C98CBFB82051U01GZ3ZDbzBnTENMcnhwdGJBRHdmN01hZER6UEpoZ2N6RVUzOEhxdGl1NkxXalEx","@context":"https://w3id.org/openbadges/v2","type":"Assertion","issuedOn":"2025-11-10T00:00:00Z","badge":{"id":"https://api.bcdiploma.com/badges/issuer/176/template/1x12F","@context":"https://w3id.org/openbadges/v2","type":"BadgeClass","@language":"en","issuer":{"id":"https://api.bcdiploma.com/badges/issuer/176","@context":"https://w3id.org/openbadges/v2","type":"Issuer","name":"Stanford Online","url":"https://online.stanford.edu/","image":"https://www.bcdiploma.com/img/issuers/stanford.png","description":"Stanford University publicly declares the use of its blockchain address and BCdiploma technology to produce certified credentials. These credentials can be consulted via a url link which provides all proofs of authenticity."},"name":"XCS234 - Reinforcement Learning","description":"Reinforcement learning algorithms are applicable to a wide range of tasks, including robotics, game playing, consumer modeling, and healthcare. In this course, learners gain a strong foundation in reinforcement learning through lectures, written and coding assignments. They explore core methods such as Markov decision processes (MDPs), Monte Carlo methods, and deep reinforcement learning. Learners also develop skills in advanced approaches like RL from human feedback (RLHF), Direct Preference Optimization (DPO), and offline RL, while addressing challenges like exploration vs. exploitation and algorithm evaluation using criteria such as regret and sample complexity.","image":"https://ipfs2.bcdiploma.com/ipfs/QmfDuPeQmc1uUAoYZW4rwptEqjV1hjzTzthRLu1ok92fbK","criteria":{"id":"https://api.bcdiploma.com/badges/issuer/176/template/1x12F","narrative":"<b>Certificate of Achievement in Reinforcement Learning</b> verified by the Stanford Engineering Center for Global & Online Education.<br>\n<span style=\"color: #8c1515; font-family: SourceB;\">Grade: Satisfactory&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;  CEU-equivalent: 10.0</span>\n\n<a href=\"https://online.stanford.edu/grades-and-units\">Grades and Units Information</a>\n<a href=\"https://online.stanford.edu/digital-credential\">Digital Credential Information</a>"},"tags":["Markov Decision Processes & Planning","Model-free Policy Evaluation","Model-free Control","Policy Search","Offline Reinforcement Learning","RL from Human Feedback","Direct Preference Optimization","Fast Learning / Data Efficiency","Exploration","Evaluation Metrics"]},"recipient":{"type":"email","identity":"sha256$85e519cbbb242e887c840651225bc1994ad08a578a843c1c2bdcbb151cb5129a","hashed":true},"verification":{"type":"HostedBadge"},"evidence":{"id":"https://digitalcredential.stanford.edu/check/0747899BF0F338B0DF00B6764FBFE88C6D64E48A7B390CB75EF2C98CBFB82051U01GZ3ZDbzBnTENMcnhwdGJBRHdmN01hZER6UEpoZ2N6RVUzOEhxdGl1NkxXalEx"}}