The mystery and influence of P. Value (also known as the p-value) have made it the most popular celebrity calculation of our time — and maybe also the most misunderstood. Despite “significant” starring roles in thousands of data analyses, many still find P. Value mystifying, or even misleading.
So who is P. Value, really? Our interviewer sat down with P. Value for an exclusive Q&A to hear its origin story and find out why its success isn’t just due to chance. And, yes, we asked the tough questions: You’ll learn the truth behind that “p-hacking” controversy you’ve heard about.
Read…
At the doctor’s office, you and the medical assistant go through a familiar routine before the doctor arrives. They’ll check your vital signs — pulse, blood pressure, respiration rate and more — and collect some general information. Those steps summarize some important aspects of your health, letting the doctor jump right into more complex analysis.
The Data Health Tool, now included in the Alteryx Intelligence Suite, does something similar for your data. It gives you a quick but thorough assessment of a dataset’s readiness for further analysis, especially prior to predictive analytics and machine learning. …
Ready for a refresher on your pandas skills — or just ready for a refreshing drink?
Pandas, the widely used Python library for data analysis and data wrangling, has an incredible variety of useful functions. If you’re new to pandas or just want to practice, this Data-Driven Cocktail Challenge will help you gain familiarity with indexing, utility functions, string functions, and more.
Be sure to refer to the pandas documentation for help if you need it, and, of course, Google and Stack Overflow are your friends, too.
If you want to be extra Pythonic, try to solve each step in…
We have a special audio treat this week: a bonus episode of the Data [in the] Sandbox podcast miniseries! We explain what “artificial intelligence” and “machine learning” mean in a way kids can understand, using everyday examples like TV show recommendations, robot vacuums and math homework.
Those everyday examples have made “artificial intelligence” and “machine learning” familiar concepts not just to data experts, but also to the general public. But how familiar? And to whom?
Google Trends data can show us with a bit more precision how popular these terms are, how their use has changed over time, and even…
When I think of dangerous Christmas decorations, I always think of scenes like this one from “National Lampoon’s Christmas Vacation”:
And yet the most dangerous part of holiday decorating doesn’t involve electricity.
I downloaded and analyzed the U.S. Consumer Product Safety Commission’s latest 10 years of data on injuries involving Christmas-related products. Applying a bit of data science, we’ll find where danger might lurk among the sparkly lights and shiny ornaments. You might be surprised: One dangerous item is something you probably use year-round.
There are 3,917 injuries related to Christmas products in the dataset from the CPSC’s National…
In this week’s Alter Everything podcast episode, guest Steve Mann from Alteryx partner Propel32 Analytics discusses the increasing importance of analytics in the mergers and acquisitions field in recent years. Data analysts and data scientists must constantly adapt to that kind of change, and there’s always something new to learn!
You may have heard of modeling techniques to predict the probability of churn for a customer, or to assess whether a customer will or won’t respond to an offer. But what about figuring out which customers might increase their purchasing — or could stop buying — as the result of a promotion?
Often we focus predictive analytics on modeling customer churn or a response to an offer (perhaps using logistic regression, as demonstrated in this excellent blog post). Uplift modeling takes a different tack. …
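To make the uplift idea concrete, here is a minimal sketch that assumes a tiny, made-up set of customers, some of whom received a promotion and some of whom did not. It is not the method from the post. It only shows the basic comparison uplift modeling builds on: the purchase rate with the promotion minus the purchase rate without it. The `customers` data and the `response_rate` helper are hypothetical.

```python
# Hypothetical toy data: one (received_promotion, made_purchase) pair per customer.
customers = [
    (True, True), (True, True), (True, False), (True, True),
    (False, True), (False, False), (False, False), (False, True),
]


def response_rate(records, treated):
    """Share of customers who purchased within the treated or untreated group."""
    outcomes = [bought for got_promo, bought in records if got_promo == treated]
    return sum(outcomes) / len(outcomes)


# Uplift here is the difference in purchase rates between the two groups.
uplift = response_rate(customers, True) - response_rate(customers, False)
print(f"Estimated uplift: {uplift:.2f}")
```

A positive number suggests the promotion nudged purchases upward in this toy sample; a negative number would suggest it pushed customers away. Real uplift models estimate this difference for each customer or segment rather than for the whole group.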
Have you ever abandoned a shopping cart in an online store and gotten a reminder email about it later? Your poor digital cart was stranded on a lonely server somewhere. But fear not, readers — we’re not abandoning you! Welcome to the second half of our introduction to market basket analysis.
In the first post, we covered some of the essential concepts behind market basket analysis, so check that out first if you’re not familiar with the basics. This post will show how to use this approach in Alteryx Designer. …
I cook green bean casserole just once a year. Although it’s kind of a culinary travesty, we still make it with Thanksgiving dinner for sentimental reasons. Its essential ingredients are green beans, canned cream of mushroom soup and — most important — so-called “french fried” onions (also from a can) sprinkled on top. All three ingredients often are grouped together in the grocery store around the holidays.
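The casserole ingredients make a handy illustration of three standard market basket measures: support, confidence and lift. The sketch below uses a handful of invented shopping baskets, not real data, and plain Python rather than Alteryx Designer. The `baskets` list and the helper names are assumptions made for this example.

```python
# Hypothetical shopping baskets, invented for illustration only.
baskets = [
    {"green beans", "mushroom soup", "fried onions"},
    {"green beans", "mushroom soup"},
    {"mushroom soup", "fried onions", "turkey"},
    {"green beans", "fried onions", "mushroom soup"},
    {"turkey", "cranberries"},
]


def support(itemset):
    """Fraction of baskets that contain every item in the itemset."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in baskets) / len(baskets)


def confidence(antecedent, consequent):
    """Of the baskets containing the antecedent, the share that also contain the consequent."""
    return support(set(antecedent) | set(consequent)) / support(antecedent)


def lift(antecedent, consequent):
    """Confidence relative to how common the consequent is overall (above 1 means a positive association)."""
    return confidence(antecedent, consequent) / support(consequent)


rule_from = {"green beans", "mushroom soup"}
rule_to = {"fried onions"}
print(f"support:    {support(rule_from | rule_to):.2f}")
print(f"confidence: {confidence(rule_from, rule_to):.2f}")
print(f"lift:       {lift(rule_from, rule_to):.2f}")
```

In this toy sample, shoppers who buy green beans and mushroom soup are somewhat more likely than average to pick up fried onions as well, which is the kind of pattern that might justify shelving the three together.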
Whether you’re the kind of person who seeks out the spooky or not, guess what: You probably live near some creepy things.
To commemorate the season, we thought it would be fun to do some macabre mapping and petrifying prediction of spooky phenomena. Data science doesn’t have to be just for serious subjects! I’ll show you how I used Alteryx Designer, Python and the mapping package Folium to analyze and map these data.
To look at how spooky U.S. metro areas are, I created a (silly) Spooky Score for each area, based on the density of cemeteries and haunted places…
Data Science Journalist for @Alteryx. Data geek and former journalism professor and researcher. Writer, knitter, hiker, cyclist. Opinions mine. she/her