Decision makers, managers and executives must understand technical people in order to make sound decisions.
Courses in this category focus on understanding methodologies, technologies and applied sciences for the sole purpose of decision making.
If you are trying to make sense of the data you have access to, or want to analyse unstructured data available on the net (such as Twitter, LinkedIn, etc.), this course is for you.
It is mostly aimed at decision makers and people who need to choose what data is worth collecting and what is worth analysing.
It is not aimed at people configuring the solution, although those people will benefit from the big picture.
During the course, delegates will be presented with working examples of mostly open source technologies.
Short lectures will be followed by presentations and simple exercises for the participants.
Content and Software used
All software used is updated each time the course is run, so we work with the newest versions available.
The course covers the whole process, from obtaining, formatting, processing and analysing the data to automating the decision-making process with machine learning.
Structured vs unstructured
Static vs streamed
Attitudinal, behavioural and demographic data
Data-driven vs user-driven analytics
Volume, velocity and variety of data
kGroups, k-means, nearest neighbours
Ant colonies, birds flocking
Support vector machine
Naive Bayes classification
Cost of software
Cost of development
Data Preparation (MapReduce)
Model deployment and integration
Overview of Open Source and commercial software
Selection of R-project packages
Hadoop and Mahout
Selected Apache projects related to Big Data and Analytics
Selected commercial solutions
Integration with existing software and data sources
This course has been created for decision makers whose primary goal is not to do the calculation and the analysis, but to understand them and be able to choose which statistical methods are relevant to the strategic planning of the organization.
For example, a prospective participant may need to decide how many samples must be collected before deciding whether or not to launch a product.
If you need a longer course that covers the very basics of statistical thinking, have a look at the 5-day "Statistics for Managers" training.
What statistics can offer to Decision Makers
Basic statistics - which statistics (e.g. median, average, percentiles) are most relevant to different distributions
Graphs - significance of getting it right (e.g. how the way the graph is created affects the decision)
Variable types - what variables are easier to deal with
Ceteris paribus, things are always in motion
Third variable problem - how to find the real influencer
Probability value - what the p-value means
Repeated experiment - how to interpret repeated experiment results
Data collection - you can minimize bias, but not get rid of it
Understanding confidence level
Decision making with limited information
how to check how much information is enough
prioritizing goals based on probability and potential return (benefit/cost ratio, decision trees)
How errors add up
What is Schrödinger's cat and what is Newton's Apple in business
Cassandra Problem - how to measure a forecast if the course of action has changed
Google Flu Trends - how it went wrong
How decisions make forecasts outdated
Forecasting - methods and practicality
Why naive forecasts are usually more responsive
How far into the past should a forecast look?
Why can more data mean a worse forecast?
Statistical Methods useful for Decision Makers
Describing Bivariate Data
Univariate data and bivariate data
Why do things differ each time we measure them?
Normal Distributions and normally distributed errors
Independent sources of information and degrees of freedom
Logic of Hypothesis Testing
What can be proven, and why it is always the opposite of what we want (Falsification)
Interpreting the results of Hypothesis Testing
How to determine a good (and cheap) sample size
False positives and false negatives, and why there is always a trade-off
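As an illustration of the sample size topic above, here is a minimal sketch of one common textbook approach: the normal-approximation formula n = z^2 * p * (1 - p) / e^2 for estimating a proportion. It is not taken from the course material; the function name, its parameters and the default values are assumptions chosen for this example only.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(margin_of_error, confidence=0.95,
                               expected_proportion=0.5):
    """Estimate how many samples are needed to measure a proportion.

    Hypothetical helper for illustration. Uses the normal approximation
    n = z^2 * p * (1 - p) / e^2 and assumes simple random sampling from
    a large population.
    """
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p = expected_proportion  # 0.5 is the most conservative (largest n) guess
    n = (z ** 2) * p * (1 - p) / margin_of_error ** 2
    # Round up: a fraction of a respondent cannot be sampled.
    return math.ceil(n)


print(sample_size_for_proportion(0.05))  # +/- 5 points at 95% confidence
print(sample_size_for_proportion(0.03))  # +/- 3 points at 95% confidence
```

Under these assumptions, tightening the margin of error from 5 to 3 percentage points raises the required sample from 385 to 1068, which shows why a "good" sample size is always weighed against its cost.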
This course is designed for people who have to choose a solution for a specific problem. IT Managers, Solution Architects, Test Managers, System Administrators and Developers can benefit from this course by understanding the benefits and costs of available Cloud/SaaS/IaaS solutions.
Overview of Cloud
Virtualization (e.g. VirtualBox, VMware, KVM)
Hardware support for virtualization (sharing network interfaces, etc.)
Shared-nothing storage (S3, Ceph, Glacier)
Mixed model (Bare Metal + Cloud)
Public Cloud Providers
Private Cloud Solutions
Software as a Service
Benefits over deployable software
Legal aspects influencing solution
Managing upgrades, versioning, etc.
Deployment options (Beanstalk, etc.)
NoSQL (e.g. MongoDB)
SQL/NewSQL (e.g. Galera Cluster)
Automate redundancy management with RDS
Pros vs Cons
Dealing with transactions and consistency
DNS load balancing (round-robin, geo-proximity, etc.; e.g. Route 53)
Virtual Image Management (Appliances)
Transferring images between zones
Image interoperability between clouds