Date: May 7, 2018

Location: New York, NY, US, 10019

Year: 2018

Company: Penguin Random House LLC 

Requisition ID: 19421 

The Data Science & Analytics group at Penguin Random House is seeking a Data Scientist.


We are an agile team of data scientists and software engineers. The team has a wide mandate encompassing pricing systems, recommendation and personalization systems, title segmentation, and supply chain, as well as ad hoc analysis and data exploration.


In this role, you will have an opportunity to work on a variety of high-profile projects under the mentorship of Senior Data Scientists and in collaboration with key decision makers across the organization.


Please include with your application a link to a GitHub or Bitbucket repository containing a code sample, whether from a Kaggle attempt, a school project, or a general open-source contribution. Standalone code samples will also be accepted.

Your profile:
Apply if you have:

  • A bachelor’s degree in mathematics, statistics, economics, computer science, business analytics, or any quantitative social science
  • Relevant coursework applying advanced statistical/machine learning and predictive analysis techniques
  • Intuition for mapping real-world problems to relevant analytical methods, models, and approaches
  • Solid capability in SQL for tasks such as computing aggregates and joining multiple tables
  • Expertise in at least one scientific computing / scripting tool, such as R or Python
  • A strong, documented desire to rapidly and continually advance skills through on-the-job and off-the-job training (e.g. via MOOCs) 


Preferred qualifications:

  • 2 years of professional experience in a data science role
  • Experience working with Python packages such as scikit-learn, pandas, or TensorFlow
  • Alternatively, a good understanding of R packages such as ggplot2, rCharts, ri, dplyr, data.table, cvTools, (b)lmer, arm, lasso/glmnet, BayesTree and reshape2/tidyr
  • Experience with Stan or other general-purpose modeling tools
  • Experience extracting data from APIs
  • Experience with UX design and data visualization
  • Experience building data products from the warehouse ingestion phase all the way through to the business-facing application side
  • Experience with automated feature engineering and large datasets (>1TB)


Penguin Random House is the leading adult and children’s publishing house in North America, the United Kingdom and many other regions around the world. In publishing the best books in every genre and subject for all ages, we are committed to quality, excellence in execution, and innovation throughout the entire publishing process: editorial, design, marketing, publicity, sales, production, and distribution. Our vibrant and diverse international community of nearly 250 publishing brands and imprints includes Ballantine Bantam Dell, Berkley, Clarkson Potter, Crown, DK, Doubleday, Dutton, Grosset & Dunlap, Little Golden Books, Knopf, Modern Library, Pantheon, Penguin Books, Penguin Press, Penguin Random House Audio, Penguin Young Readers, Portfolio, Puffin, Putnam, Random House, Random House Children’s Books, Riverhead, Ten Speed Press, Viking, and Vintage, among others. More information can be found at


Penguin Random House values the array of talents and perspectives that a diverse workforce brings. All qualified applicants will receive consideration for employment without regard to race, national origin, religion, age, color, sex, sexual orientation, gender identity, disability, or protected veteran status.