ref material to use later was very good.
PAUL BEALES - Seagate Technology.
Predictive Analytics courses
|bigdatar||Programming with Big Data in R||21 hours||Introduction to Programming Big Data with R (pbdR) Setting up your environment to use pbdR Scope and tools available in pbdR Packages commonly used with Big Data alongside pbdR Message Passing Interface (MPI) Using pbdR MPI 5 Parallel processing Point-to-point communication Send Matrices Summing Matrices Collective communication Summing Matrices with Reduce Scatter / Gather Other MPI communications Distributed Matrices Creating a distributed diagonal matrix SVD of a distributed matrix Building a distributed matrix in parallel Statistics Applications Monte Carlo Integration Reading Datasets Reading on all processes Broadcasting from one process Reading partitioned data Distributed Regression Distributed Bootstrap|
|kdd||Knowledge Discovery in Databases (KDD)||21 hours||Knowledge discovery in databases (KDD) is the process of discovering useful knowledge from a collection of data. Real-life applications for this data mining technique include marketing, fraud detection, telecommunication and manufacturing. In this course, we introduce the processes involved in KDD and carry out a series of exercises to practice the implementation of those processes. Audience: Data analysts or anyone interested in learning how to interpret data to solve problems. Format of the course: After a theoretical discussion of KDD, the instructor will present real-life cases which call for the application of KDD to solve a problem. Participants will prepare, select and cleanse sample data sets and use their prior knowledge about the data to propose solutions based on the results of their observations. Introduction KDD vs data mining Establishing the application domain Establishing relevant prior knowledge Understanding the goal of the investigation Creating a target data set Data cleaning and preprocessing Data reduction and projection Choosing the data mining task Choosing the data mining algorithms Interpreting the mined patterns|
|datamodeling||Pattern Recognition||35 hours||This course provides an introduction to the field of pattern recognition and machine learning. It touches on practical applications in statistics, computer science, signal processing, computer vision, data mining, and bioinformatics. The course is interactive and includes plenty of hands-on exercises, instructor feedback, and testing of knowledge and skills acquired. Audience: Data analysts, PhD students, researchers and practitioners. Introduction Probability theory, model selection, decision and information theory Probability distributions Linear models for regression and classification Neural networks Kernel methods Sparse kernel machines Graphical models Mixture models and EM Approximate inference Sampling methods Continuous latent variables Sequential data Combining models|
|d2dbdpa||From Data to Decision with Big Data and Predictive Analytics||21 hours||Audience: If you are trying to make sense of the data you have access to, or want to analyse unstructured data available on the net (such as Twitter, LinkedIn, etc.), this course is for you. It is mostly aimed at decision makers and people who need to choose what data is worth collecting and what is worth analysing. It is not aimed at people configuring the solution, although they will benefit from the big picture. Delivery Mode: During the course, delegates will be presented with working examples of mostly open source technologies. Short lectures will be followed by presentations and simple exercises by the participants. Content and Software used: All software used is updated each time the course is run, so we use the newest versions possible. The course covers the process from obtaining, formatting, processing and analysing the data to automating the decision-making process with machine learning. Quick Overview Data Sources Mining Data Recommender systems Target Marketing Datatypes Structured vs unstructured Static vs streamed Attitudinal, behavioural and demographic data Data-driven vs user-driven analytics data validity Volume, velocity and variety of data Models Building models Statistical Models Machine learning Data Classification Clustering kGroups, k-means, nearest neighbours Ant colonies, birds flocking Predictive Models Decision trees Support vector machine Naive Bayes classification Neural networks Markov Model Regression Ensemble methods ROI Benefit/Cost ratio Cost of software Cost of development Potential benefits Building Models Data Preparation (MapReduce) Data cleansing Choosing methods Developing model Testing Model Model evaluation Model deployment and integration Overview of Open Source and commercial software Selection of R-project package Python libraries Hadoop and Mahout Selected Apache projects related to Big Data and Analytics Selected commercial solution Integration with existing software and data sources|
|apachemdev||Apache Mahout for Developers||14 hours||Audience: Developers involved in projects that use machine learning with Apache Mahout. Format: Hands-on introduction to machine learning. The course is delivered in a lab format based on real-world practical use cases. Implementing Recommendation Systems with Mahout Introduction to recommender systems Representing recommender data Making recommendations Optimizing recommendations Clustering Basics of clustering Data representation Clustering algorithms Clustering quality improvements Optimizing clustering implementation Application of clustering in the real world Classification Basics of classification Classifier training Classifier quality improvements|
|appliedml||Applied Machine Learning||14 hours||This training course is for people who would like to apply Machine Learning in practical applications. Audience: This course is for data scientists and statisticians who have some familiarity with statistics and know how to program in R (or Python or another chosen language). The emphasis of this course is on the practical aspects of data/model preparation, execution, post hoc analysis and visualization. The purpose is to give participants practical applications of Machine Learning that they can use at work. Sector-specific examples are used to make the training relevant to the audience. Naive Bayes Multinomial models Bayesian categorical data analysis Discriminant analysis Linear regression Logistic regression GLM EM Algorithm Mixed Models Additive Models Classification KNN Bayesian Graphical Models Factor Analysis (FA) Principal Component Analysis (PCA) Independent Component Analysis (ICA) Support Vector Machines (SVM) for regression and classification Boosting Ensemble models Neural networks Hidden Markov Models (HMM) State Space Models Clustering|
|Piwik||Getting started with Piwik||21 hours||Audience: Web analysts, data analysts, market researchers, marketing and sales professionals, system administrators. Format of the course: Part lecture, part discussion, heavy hands-on practice. Introduction to Piwik Why use Piwik? Piwik vs Google Analytics Setting up Piwik Selecting which websites to monitor Working with the dashboard Understanding visitor activity Actions Referrals Generating reports|
|predmodr||Predictive Modelling with R||14 hours||Problems facing forecasters Customer demand planning Investor uncertainty Economic planning Seasonal changes in demand/utilization Roles of risk and uncertainty Time series Forecasting Seasonal adjustment Moving average Exponential smoothing Extrapolation Linear prediction Trend estimation Stationarity and ARIMA modelling Econometric methods (causal methods) Regression analysis Multiple linear regression Multiple non-linear regression Regression validation Forecasting from regression Judgemental methods Surveys Delphi method Scenario building Technology forecasting Forecast by analogy Simulation and other methods Simulation Prediction market Probabilistic forecasting and Ensemble forecasting|
|matlabdsandreporting||MATLAB Fundamentals, Data Science & Report Generation||126 hours||In the first part of this training, we cover the fundamentals of MATLAB and its function as both a language and a platform. Included in this discussion is an introduction to MATLAB syntax, arrays and matrices, data visualization, script development, and object-oriented principles. In the second part, we demonstrate how to use MATLAB for data mining, machine learning and predictive analytics. To provide participants with a clear and practical perspective of MATLAB's approach and power, we draw comparisons between using MATLAB and using other tools such as spreadsheets, C, C++, and Visual Basic. In the third part of the training, participants learn how to streamline their work by automating their data processing and report generation. Throughout the course, participants will put into practice the ideas learned through hands-on exercises in a lab environment. By the end of the training, participants will have a thorough grasp of MATLAB's capabilities and will be able to employ it for solving real-world data science problems as well as for streamlining their work through automation. Assessments will be conducted throughout the course to gauge progress. Format of the course: Course includes theoretical and practical exercises, including case discussions, sample code inspection, and hands-on implementation. Note: Practice sessions will be based on pre-arranged sample data report templates. If you have specific requirements, please contact us to arrange. Introduction MATLAB for data science and reporting Part 01: MATLAB fundamentals Overview MATLAB for data analysis, visualization, modeling, and programming. 
Working with the MATLAB user interface Overview of MATLAB syntax Entering commands Using the command line interface Creating variables Numeric vs character data Analyzing vectors and matrices Creating and manipulating Performing calculations Visualizing vector and matrix data Working with data files Importing data from Excel spreadsheets Working with data types Working with table data Automating commands with scripts Creating and running scripts Organizing and publishing your scripts Writing programs with branching and loops User interaction and flow control Writing functions Creating and calling functions Debugging with MATLAB Editor Applying object-oriented programming principles to your programs Part 02: MATLAB for data science Overview MATLAB for data mining, machine learning and predictive analytics Accessing data Obtaining data from files, spreadsheets, and databases Obtaining data from test equipment and hardware Obtaining data from software and the Web Exploring data Identifying trends, testing hypotheses, and estimating uncertainty Creating customized algorithms Creating visualizations Creating models Publishing customized reports Sharing analysis tools As MATLAB code As standalone desktop or Web applications Using the Statistics and Machine Learning Toolbox Using the Neural Network Toolbox Part 03: Report generation Overview Presenting results from MATLAB programs, applications, and sample data Generating Microsoft Word, PowerPoint®, PDF, and HTML reports. 
Templated reports Tailor-made reports Using organization’s templates and standards Creating reports interactively vs programmatically Using the Report Explorer Using the DOM (Document Object Model) API Creating reports interactively using Report Explorer Report Explorer Examples Magic Squares Report Explorer Example Creating reports Using Report Explorer to create report setup file, define report structure and content Formatting reports Specifying default report style and format for Report Explorer reports Generating reports Configuring Report Explorer for processing and running report Managing report conversion templates Copying and managing Microsoft Word, PDF, and HTML conversion templates for Report Explorer reports Customizing Report Conversion templates Customizing the style and format of Microsoft Word and HTML conversion templates for Report Explorer reports Customizing components and style sheets Customizing report components, defining layout style sheets Creating reports programmatically in MATLAB Template-Based Report Object (DOM) API Examples Functional report Object-oriented report Programmatic report formatting Creating report content Using the Document Object Model (DOM) API Report format basics Specifying format for report content Creating form-based reports Using the DOM API to fill in the blanks in a report form Creating object-oriented reports Deriving classes to simplify report creation and maintenance Creating and formatting report objects Lists, tables, and images Creating DOM Reports from HTML Appending HTML string or file to a Microsoft® Word, PDF, or HTML report generated by Document Object Model (DOM) API Creating report templates Creating templates to use with programmatic reports Formatting page layouts Formatting pages in Microsoft Word and PDF reports Summary and closing remarks|
|intror||Introduction to R with Time Series Analysis||21 hours||Introduction and preliminaries Making R more friendly, R and available GUIs RStudio Related software and documentation R and statistics Using R interactively An introductory session Getting help with functions and features R commands, case sensitivity, etc. Recall and correction of previous commands Executing commands from or diverting output to a file Data permanency and removing objects Simple manipulations; numbers and vectors Vectors and assignment Vector arithmetic Generating regular sequences Logical vectors Missing values Character vectors Index vectors; selecting and modifying subsets of a data set Other types of objects Objects, their modes and attributes Intrinsic attributes: mode and length Changing the length of an object Getting and setting attributes The class of an object Arrays and matrices Arrays Array indexing. Subsections of an array Index matrices The array() function The outer product of two arrays Generalized transpose of an array Matrix facilities Matrix multiplication Linear equations and inversion Eigenvalues and eigenvectors Singular value decomposition and determinants Least squares fitting and the QR decomposition Forming partitioned matrices, cbind() and rbind() The concatenation function, c(), with arrays Frequency tables from factors Lists and data frames Lists Constructing and modifying lists Concatenating lists Data frames Making data frames attach() and detach() Working with data frames Attaching arbitrary lists Managing the search path Data manipulation Selecting, subsetting observations and variables Filtering, grouping Recoding, transformations Aggregation, combining data sets Character manipulation, stringr package Reading data Txt files CSV files XLS, XLSX files SPSS, SAS, Stata and other data formats Exporting data to txt, csv and other formats Accessing data from databases using SQL language Probability distributions R as a set of statistical tables Examining 
the distribution of a set of data One- and two-sample tests Grouping, loops and conditional execution Grouped expressions Control statements Conditional execution: if statements Repetitive execution: for loops, repeat and while Writing your own functions Simple examples Defining new binary operators Named arguments and defaults The '...' argument Assignments within functions More advanced examples Efficiency factors in block designs Dropping all names in a printed array Recursive numerical integration Scope Customizing the environment Classes, generic functions and object orientation Graphical procedures High-level plotting commands The plot() function Displaying multivariate data Display graphics Arguments to high-level plotting functions Basic visualisation graphs Multivariate relations with lattice and ggplot2 packages Using graphics parameters Graphics parameters list Time series Forecasting Seasonal adjustment Moving average Exponential smoothing Extrapolation Linear prediction Trend estimation Stationarity and ARIMA modelling Econometric methods (causal methods) Regression analysis Multiple linear regression Multiple non-linear regression Regression validation Forecasting from regression|
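
For readers unfamiliar with two of the time series topics named in the outlines above (moving average and exponential smoothing), here is a minimal illustrative sketch. It is not taken from the course material: the courses themselves use R, Python is used here only for illustration, and the function names, the `demand` sample values and the choice of `alpha` are all hypothetical.

```python
from typing import List


def moving_average(series: List[float], window: int) -> List[float]:
    """Trailing simple moving average.

    Returns len(series) - window + 1 values, one per full window.
    """
    if window < 1 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    return [
        sum(series[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(series))
    ]


def exponential_smoothing(series: List[float], alpha: float) -> List[float]:
    """Simple exponential smoothing.

    s[0] = x[0]; s[t] = alpha * x[t] + (1 - alpha) * s[t - 1].
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    if not series:
        return []
    smoothed = [series[0]]
    for x in series[1:]:
        smoothed.append(alpha * x + (1.0 - alpha) * smoothed[-1])
    return smoothed


# Made-up sample data, for illustration only.
demand = [10.0, 12.0, 14.0, 16.0, 18.0]
print(moving_average(demand, window=3))
print(exponential_smoothing(demand, alpha=0.5))
```

A larger `window` or a smaller `alpha` gives a smoother series that reacts more slowly to recent changes.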