It is our pleasure to offer our intensive beginner-level R course for the third time! Beginning this Sunday, the 35-hour course will walk you through the basic operations and characteristics of R, all the way to a firm understanding of data manipulation and visualization. Also launching this weekend are two brand-new courses, Data Visualization for D3.js and Data Science by Python, both at the beginner level.

R users will rule the world :) Make sure to sign up today!

Taught by preeminent data scientists in New York City, these beginner NYC Data Science Academy courses are the best introduction to the exciting world of R, open data, and statistical science.

If interested, please read the course descriptions below and RSVP today!

“NYC Data Science Academy provided me great exposure to data science topics that I haven’t come across in either school or previous jobs. The hands-on assignments are practical and make use of real-world examples. As product development is becoming more data-driven, it will be crucial for product teams to have a solid grasp of data analysis which NYC Data Science Academy fills the knowledge/skill gap.”

Donald Fleurantin on Feb 4, 2014

“I attended the beginner’s workshop for R and I found it extremely useful. The classes were very well organized. The slides were well paced with many practical examples. I especially like the hands on format of the class, you work through the slides on your laptop. I had very little knowledge of R before and I learned many tools during the course. I was particularly interested in the visualization tools. Since the course, I have used some of the charting tools that I learned in my presentations at work as well. Both Scott and Vivian did an excellent job teaching R basics. They were very helpful and answered questions in person, email and piazza (online platform where we would post our solutions). Vivian also shared with the class a lot of material and practical examples. I would highly recommend this course to users who are interested in learning R. ”

Heena D on Feb 26, 2014

“The Introductory R class covered a broad range of information, and for a statistics and programming newbie like me, was indispensable for coming up to speed on a variety of related subject matter. Vivian is passionate about R, open data, statistics, etc.. Her enthusiasm is contagious! ”

Jasna on Jan 28, 2014

1. Data Science by R programming (Beginner level) R003

Dates: Mar 16th, 23rd, 30th, April 6th, 13th (five Sundays)

Time: 10:00am-5:00pm

Instructor: Vivian Zhang (CTO at Supstat Inc, master's degrees in Computer Science and Statistics)

Cost: $220 per class or $1100 for all five classes.

Note: NYC Data Science Academy does not offer individual classes. For group (5 or more persons) and enterprise pricing, please email vivian.zhang@supstat.com

Refund Policy: We offer a full refund if you are not happy with the first class and wish to drop the course.

RSVP: Data Science by R programming (Beginner level, Five Sun) R003

 Teaching on Centre St.

Course Outline:

(Content may be adjusted based on actual teaching conditions)

  1. Basics: 12 hours

  • Abstract: This unit covers the basic operations of R. Students will learn the characteristics of R, where to find resources, and the fundamentals of programming in it.

  • Case Study and Exercise: Use R to solve selected Project Euler problems.

 

    • How to learn R

    • How to get help

    • R language resources and books

    • RStudio

    • Packages

    • Workspace

    • Custom Startup Items

    • Batch Mode

    • Data Objects

    • Custom Functions

    • Control Statements

    • Vectorized Operations

 

  1. Getting Data: 6 hours

  • Abstract: This unit covers the various ways R reads data: the basics of web crawling, querying a database with SQL statements, and reading local files such as Excel spreadsheets.

  • Case and Exercise: Crawl data from the Douban website and write a custom function.

    • Web data capture

    • API data source

    • Connect to the database

    • Local files

    • Other data sources

    • Data Export

 

  1. Data Manipulation: 6 hours

  • Abstract: How to manipulate data in R and perform all kinds of data conversions, with particular attention to string processing.

  • Case Study and Exercise: Find a QQ group (QQ is a widely used instant messaging tool), then discuss research options based on its text features.

    • Data sorting

    • Merge Data

    • Summarizing data

    • Reshaping data

    • Subsetting data

    • String manipulation

    • Date operations

 

  1. Data Visualization: 6 hours

  • Abstract: Covers two advanced plotting packages (lattice and ggplot2) and the various methods of visualization.

  • Case and Exercise: Build visualizations from graphical, text, and other data.

    • Histogram

    • Point

    • Column

    • Line

    • Pie

    • Box Plot

    • Scatter

    • Matrix related

    • Map

 

Note: If class finishes early, we will cover selected topics from the list below, based on students' needs.

 

  1. Elementary Statistical Methods:

  • Abstract: An introduction to using R for statistical analysis and regression analysis. Students will learn the meaning and role of basic statistical models.

  • Case and Exercise: Use regression to predict commodity prices; simulate the winner of a casino game.

 

    • Descriptive Statistics

    • Statistical Distributions

    • Frequency and contingency tables

    • Correlation

    • t-test

    • Non-parametric statistics

    • Linear Regression

    • Regression Diagnostics

    • Robust Regression

    • Nonlinear regression

    • Principal Component Analysis

    • Logistic Regression

    • Statistical Simulation

 

  1. Preliminary Data Mining:

  • Abstract: Covers the R packages and functions used for data mining. Students will learn two families of mining methods: supervised learning and unsupervised learning.

  • Case and Exercise: Use R to take part in a Kaggle data mining competition.

    • General Mining Process

    • Rattle package

    • Hierarchical clustering

    • K-means clustering

    • Decision Trees

    • Backpropagation (BP) neural network

 

2. Data Visualization for D3.js (Beginner Level) D001

Dates: Mar 15th, 22nd, 29th, April 5th, 12th (five Saturdays)

Time: 9:00am-1:00pm

Instructor: Adam Pearce is a Data Interaction Developer at Quovo, a web-based investment data analytics and visualization platform. He is one of the top Stack Overflow D3 experts and his work has been featured in The Atlantic Cities, Visualizing.org, visual.ly, and VisualLoop.

Note: NYC Data Science Academy does not offer individual classes. For group (5 or more persons) and enterprise pricing, please email vivian.zhang@supstat.com

Cost: $850 per person

Refund Policy: We offer a full refund if you are not happy with the first class and wish to drop the course.

RSVP: Data Visualization by D3.js (Beginner level, Five Sat) D001

Course Outline:

(Content may be adjusted based on actual teaching conditions)

  1. Week 1
  • Basic Building Blocks

    • Why D3

    • HTML

    • CSS

    • SVG

    • Javascript

    • Chrome Dev Tools

  • Scatter Plot

    • Selections

    • Appending

    • Data Binding

  1. Week 2
  • Sprucing Things Up

    • Margin conventions

    • Scales

    • Axes

    • Loading data

  • Bar chart

    • Interaction

    • Transitions

    • Nested data: grouped bar chart, stacked bar chart

 

  1. Week 3
  • Line chart

    • SVG Paths

    • Area and line generators

    • Time formatting

    • Brushing

  • Reusable charts

    • Closures

    • Sparklines

    • Responsive design

    • d3.dispatch

  1. Week 4

  • Advanced Javascript

    • Functional programming

    • D3 & Arrays

    • Interactive sparkline

    • d3.nest

    • Pie charts

    • Crossfilter

  1. Week 5

  • Mapping

    • Choropleth

    • Zoom and pan

    • Projections

    • TopoJSON

 

We also offer in-depth workshops on real-world projects, such as The New Yorker's subway income visualization: http://www.newyorker.com/sandbox/business/subway.html

 

3. Data Science by Python (Beginner Level) P001

Dates: Mar 15th, 22nd, 29th, April 5th, 12th (five Saturdays)

Time: 1:15pm-5:15pm

Instructor: John Downs is a software engineer here in NYC. John is a data science enthusiast and an expert in Python and Clojure. His experience spans Python, C/C++, Clojure, Java, JavaScript, and MATLAB.

Cost: $850 per person

Note: NYC Data Science Academy does not offer individual classes. For group (5 or more persons) and enterprise pricing, please email vivian.zhang@supstat.com

Refund Policy: We offer a full refund if you are not happy with the first class and wish to drop the course.

RSVP: Data Science by Python (Beginner level) P001

 

Course Outline:

(Content may be adjusted based on actual teaching conditions)

  1. Week I: An introduction to Python

Reading: Think Python CH 2, 3, 5, 6, 7, 8, 10-15

http://www.greenteapress.com/thinkpython/html/index.html

    • basic syntax

    • conditionals

    • iteration

    • functions

    • data structures

    • classes

 

  1. Week II: Python Standard Library and Computational Statistics

Reading: Section 9 of the Python standard library http://docs.python.org/2/library/

Think Stats CH 2, 4-9 http://www.greenteapress.com/thinkstats/html/index.html

  • Python standard library

    • regular expressions

    • datetime

    • random

    • itertools

    • functools

    • math

  • Computational statistics

    • descriptive statistics

    • probability distributions

    • hypothesis testing

    • correlation

  1. Week III: Visualization and Exploratory Data Analysis

Reading: Python for Data Analysis CH 5, 7, 9, 10

  • Visualization with Matplotlib

    • histograms

    • line charts

    • scatterplots

    • pie charts

    • boxplots

    • animation

    • subplots

  • Exploratory data analysis with Pandas

    • Pandas data structures

    • Handling missing data

    • Merging, aggregating and transforming data

    • Sampling

    • Time series

  1. Week IV: A gentle introduction to scientific computing and machine learning

Reading: Python for Data Analysis CH 4, 11

Doing Data Science: CH 3-5

Optional: Learning Scikit-Learn

  • NumPy

    • Linear algebra

    • Random numbers

    • Testing with NumPy

  • Introduction to Scikit-learn

    • K-Nearest Neighbors

    • K-Means

    • Naive Bayes

    • Logistic Regression

    • Linear Regression

  1. Week V: Building a data product

Reading: Doing Data Science CH 8-9

    • Using web APIs

    • requests library

    • web scraping

    • Databases – pymongo

    • Building a web application with Flask