Data Mining

  • Most Topular Stories

  • Gartner: Analytics Is Top Business Priority

    Data mining News
    30 Oct 2014 | 2:47 pm
    … specialized computing/database environments, to data mining and intelligent algorithms.
  • More Tornadoes Recorded Doesn’t Mean More Tornadoes Occurring

    The Numbers
    Jo Craven McGinty
    24 Oct 2014 | 11:39 am
    The number of tornadoes recorded by the National Climatic Data Center has increased in recent years, but that doesn’t mean there have been more twisters.
  • Hyperparameter search, Bayesian optimization and related topics

    natural language processing blog
    10 Oct 2014 | 10:55 am
    In terms of (importance divided by glamour), hyperparameter (HP) search is probably pretty close to the top. We all hate finding hyperparameters. Default settings are usually good, but you're always left wondering: could I have done better? I like averaged perceptron for this reason (I believe Yoav Goldberg has also expressed this sentiment): no pesky hyperparameters. But I want to take a much broader perspective on hyperparameters. We typically think of HPs as { regularization constant, learning rate, architecture } (where "architecture" can mean something like neural network structure,…
  • Retail Inflection Point

    Kevin Hillstrom: MineThatData
    Kevin Hillstrom
    29 Oct 2014 | 8:15 pm
    If you want to know how important your store is to your "omnichannel mix", perform this very simple analysis:
      Segment Annual Demand (Retail, Website+Phone+Mobile) by Store Distance.
      Calculate the Percentage of Demand Within Store Distance Band Attributed to Stores.
    Here's an example:
      0 to 5 Miles = 77% Retail.
      6 to 10 Miles = 62% Retail.
      11 to 15 Miles = 51% Retail.
      16 to 25 Miles = 46% Retail.
      26 to 50 Miles = 40% Retail.
      51 to 75 Miles = 30% Retail.
      76 to 100 Miles = 25% Retail.
      101 to 150 Miles = 22% Retail.
      151+ Miles = 20% Retail.
    Here, the inflection point is at 16-25 miles from a store. That's…
  • MySpace Owner Names Ex-Ford Marketer as CMO

    Latest articles from Direct Marketing News
    Direct Marketing News
    30 Oct 2014 | 11:46 am
    Jon Schulz will lead marketing at Interactive Media Holdings, which seeks to advance its reputation as a digital solutions provider.
 

    The Numbers

  • More Tornadoes Recorded Doesn’t Mean More Tornadoes Occurring

    Jo Craven McGinty
    24 Oct 2014 | 11:39 am
    The number of tornadoes recorded by the National Climatic Data Center has increased in recent years, but that doesn’t mean there have been more twisters.
  • Talking to Your Phone, Dining Out and Slower Runs (Statshot)

    David Goldenberg
    24 Oct 2014 | 10:29 am
    American teens don’t use smartphone voice-recognition technology much more than adults do overall, but they use it in different ways. Homemade meals make up almost 20% less of our calorie intake than they did 35 years ago. The jump in marathon participation has brought with it a sharp increase in average finishing time.
  • Using Air Traffic Data to Predict Ebola’s Spread

    Jo Craven McGinty
    17 Oct 2014 | 10:32 am
    While a number of researchers are modeling the spread of Ebola in West African countries, a group at Boston’s Northeastern University has used air traffic connections to explore how the disease might spread to the rest of the world.
  • Leaving Puerto Rico, Counting Calories and a New No. 1 (Statshot)

    David Goldenberg
    17 Oct 2014 | 9:46 am
    Far more Puerto Ricans now live off the island than on it, many fast food chains have started serving slightly lighter fare, and Mississippi State took over first place in the AP football poll this week for the first time in its history.
  • Quiz: How Do Politics Relate to Shopping Habits?

    Rani Molla
    17 Oct 2014 | 4:46 am
    People's political beliefs extend into a number of areas of their lives. According to data from a market research company, these belief systems also relate to how and what people buy.

    natural language processing blog

  • Hyperparameter search, Bayesian optimization and related topics

    10 Oct 2014 | 10:55 am
    In terms of (importance divided by glamour), hyperparameter (HP) search is probably pretty close to the top. We all hate finding hyperparameters. Default settings are usually good, but you're always left wondering: could I have done better? I like averaged perceptron for this reason (I believe Yoav Goldberg has also expressed this sentiment): no pesky hyperparameters. But I want to take a much broader perspective on hyperparameters. We typically think of HPs as { regularization constant, learning rate, architecture } (where "architecture" can mean something like neural network structure,…
  • Machine learning is the new algorithms

    3 Oct 2014 | 10:19 am
    When I was an undergrad, probably my favorite CS class I took was algorithms. I liked it (a) because my background was math so it was the closest match to what I knew and (b) because even though it was "theory," a lot of the stuff we learned was really relevant. Over time, it seemed like the area had distilled worthwhile algorithms from interesting-in-theory-but-you'll-never-actually-use algorithms. In fact, I think this is a large part of why most undergraduate CS degrees today require a course in algorithms. You have these very nice, clearly defined statements, and very elegant solutions to…
  • AMR: Not semantics, but close (? maybe ???)

    27 Sep 2014 | 9:00 am
    Okay, necessary warning. I'm not a semanticist. I'm not even a linguist. Last time I took semantics was twelve years ago (sigh.) Like a lot of people, I've been excited about AMR (the "Abstract Meaning Representation") recently. It's hard not to get excited. Semantics is all the rage. And there are those crazy people out there who think you can cram meaning of a sentence into a !#$* vector [1], so the part of me that likes Language likes anything that has interesting structure and calls itself "Meaning." I effluviated about AMR in the context of the (awesome) SemEval panel. There is an LREC…
  • Reading group notes: point/counter-point on "predict models"

    31 Jul 2014 | 6:26 am
    In our local summer reading group, I led the discussion of two papers that appeared in Baltimore last month:
      Marco Baroni, Georgiana Dinu & German Kruszewski, Don't count, predict! A systematic comparison of context-counting vs. context-predicting semantic vectors. ACL 2014.
      Omer Levy & Yoav Goldberg, Linguistic Regularities in Sparse and Explicit Word Representations. CoNLL 2014 (best paper award recipient).
    I love handouts, so I made a handout for this one too. I paste below the handout. All good ideas are those of the respective authors; all errors and bad ideas are probably due to…
  • Hello, World!

    27 Jul 2014 | 7:18 am
    Okay, usually Hello World is the first program you learn to write in a new programming language. For fun, I've been collecting how to say hello world in different human languages, something remarkably difficult to search for (because of the overloading of the word "language"). I have 28. I'd like to make it to 280 :). If you have one (or more) to contribute, email me, post a comment, or tweet to me @haldaume3. And of course if you think any of these is wrong, please let me know that too.
      1 bar Servus Woid!
      2 ca Hola Món!
      3 de Hallo Welt!
      4 en Hello World!
      5 eo Saluton, Mondo!
      6 es ¡Hola…
 

    Kevin Hillstrom: MineThatData

  • Retail Inflection Point

    Kevin Hillstrom
    29 Oct 2014 | 8:15 pm
    If you want to know how important your store is to your "omnichannel mix", perform this very simple analysis:
      Segment Annual Demand (Retail, Website+Phone+Mobile) by Store Distance.
      Calculate the Percentage of Demand Within Store Distance Band Attributed to Stores.
    Here's an example:
      0 to 5 Miles = 77% Retail.
      6 to 10 Miles = 62% Retail.
      11 to 15 Miles = 51% Retail.
      16 to 25 Miles = 46% Retail.
      26 to 50 Miles = 40% Retail.
      51 to 75 Miles = 30% Retail.
      76 to 100 Miles = 25% Retail.
      101 to 150 Miles = 22% Retail.
      151+ Miles = 20% Retail.
    Here, the inflection point is at 16-25 miles from a store. That's…
  • Macy's Allegedly Drives $6 In-Store Demand Per $1 Of Search Spend

    Kevin Hillstrom
    28 Oct 2014 | 8:15 pm
    Yup, click here folks. We learned this at Nordstrom, way back in 2004-2005 (hint - that's what happens when you have a good database and staff dedicated to measuring store/web dynamics) ... technically, we learned that we drove at least as much volume in-store with search as we drove online. Once you learn that, you invest your money differently. In fact, you can kill a catalog division and not lose sales once you know a fact like that. Now, this is assuming that the article reflects reality. There are MANY reasons to think that the article is biased. The theme of the article shifts from Macy's…
  • How Do I Know I Have A Lapsed Buyer / Reactivation Problem?

    Kevin Hillstrom
    27 Oct 2014 | 8:15 pm
    Here's one of two queries I like to run to identify customer reactivation / lapsed buyer problems.
      Step 1 = Identify all customers who purchased in September 2014.
      Step 2 = For all customers who purchased in September 2014, count the number who are not first-time buyers, and who had not purchased in the twelve months from September 2013 - August 2014. This is the number of customers who are "reactivated".
      Step 3 = Re-run this query, shifting all dates back exactly one month. Count the number of reactivated buyers.
      Step 4 = After running this query, going back in time several years, calculate the…
  • Grumbling About Amazon

    Kevin Hillstrom
    26 Oct 2014 | 8:15 pm
    In Madison, about 80,000 fans pack the stadium (students pack it a bit after the 11:00am starting time, but whatever), paying a lot of money to attend a game that is being freely televised across the country. Oh, I know, you're going to nitpick this, telling me it is only available on certain cable systems or satellite providers. Fine, point taken. Have you ever looked at what it costs to purchase football tickets? Click here, it's an expensive proposition. You have to pay a "contribution fee" that is several hundred dollars, just to earn the right to purchase season tickets. That's like paying…
  • Often, Existing Customers Hate Change

    Kevin Hillstrom
    23 Oct 2014 | 8:15 pm
    I'm frequently disappointed that some vendors believe that my clients are complete idiots who are unwilling to change, as if vendors are flexible organizations that change at the drop of a hat. There are times when change is possible. There are times when the customer refuses to let you change. I think back to my time at Nordstrom. We killed off a $36,000,000 catalog business. This business had a merchandise assortment that evolved toward a 55+, rural customer ... a customer in stark contrast to the suburban/urban retail shopper that thrived in our stores. When we killed the catalog business,…
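
The distance-band analysis described in "Retail Inflection Point" can be sketched in Python. This is a minimal sketch and not the author's code: the record layout (distance in miles, retail demand, direct demand), the function names, and the sample figures are all assumptions made for illustration. The band edges follow the example in the excerpt, and treating the first band where stores account for less than half of demand as the "inflection point" is one reading of that example.

```python
from collections import defaultdict

# Distance bands as (upper bound in miles, label), following the excerpt's example.
BANDS = [(5, "0 to 5"), (10, "6 to 10"), (15, "11 to 15"), (25, "16 to 25"),
         (50, "26 to 50"), (75, "51 to 75"), (100, "76 to 100"),
         (150, "101 to 150"), (float("inf"), "151+")]


def band_for(distance_miles):
    """Return the label of the first band whose upper bound covers the distance."""
    for upper, label in BANDS:
        if distance_miles <= upper:
            return label


def retail_share_by_band(customers):
    """customers: iterable of (distance_miles, retail_demand, direct_demand).

    Returns {band label: share of total demand attributed to stores}.
    The tuple layout is a hypothetical one chosen for this sketch.
    """
    retail = defaultdict(float)
    total = defaultdict(float)
    for distance, retail_demand, direct_demand in customers:
        label = band_for(distance)
        retail[label] += retail_demand
        total[label] += retail_demand + direct_demand
    return {label: retail[label] / total[label]
            for _, label in BANDS if total[label] > 0}


def first_band_below(shares, threshold=0.5):
    """First band (nearest to farthest) where the retail share drops below threshold."""
    for _, label in BANDS:
        if label in shares and shares[label] < threshold:
            return label
    return None


# Made-up sample records, shaped to echo three of the excerpt's bands.
sample = [(3, 770, 230), (8, 620, 380), (20, 460, 540)]
shares = retail_share_by_band(sample)
inflection = first_band_below(shares)
```

With real data, each tuple would be one customer's annual demand split by channel, and the 0.5 threshold can be changed to match whatever definition of the inflection point the business prefers.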
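
Steps 1 to 3 of the reactivation query in "How Do I Know I Have A Lapsed Buyer / Reactivation Problem?" can be sketched in Python. This is a minimal sketch under stated assumptions, not the author's query: the purchase records are assumed to be (customer_id, purchase_date) pairs, and the function names are hypothetical. Step 4 is truncated in the excerpt, so the sketch stops at producing the monthly counts.

```python
from datetime import date


def month_index(d):
    """Map a date to a running month number so months can be compared directly."""
    return d.year * 12 + (d.month - 1)


def count_reactivated(purchases, year, month, lapse_months=12):
    """Count customers who bought in the target month, are not first-time
    buyers, and had no purchase in the preceding lapse_months months.

    purchases: iterable of (customer_id, purchase_date); layout is assumed.
    """
    target = year * 12 + (month - 1)
    months_by_customer = {}
    for customer_id, purchase_date in purchases:
        months_by_customer.setdefault(customer_id, set()).add(
            month_index(purchase_date))
    reactivated = 0
    for months in months_by_customer.values():
        if target not in months:
            continue  # Step 1: must have purchased in the target month
        earlier = [m for m in months if m < target]
        if not earlier:
            continue  # Step 2: exclude first-time buyers
        if max(earlier) < target - lapse_months:
            reactivated += 1  # Step 2: nothing in the prior twelve months
    return reactivated


def reactivation_trend(purchases, year, month, periods):
    """Step 3: repeat the count, shifting the target back one month at a time."""
    purchases = list(purchases)
    target = year * 12 + (month - 1)
    trend = []
    for offset in range(periods):
        y, m = divmod(target - offset, 12)
        trend.append(((y, m + 1), count_reactivated(purchases, y, m + 1)))
    return trend


# Made-up sample data for illustration only.
sample = [
    ("a", date(2012, 3, 1)), ("a", date(2014, 9, 5)),    # lapsed, then came back
    ("b", date(2014, 9, 9)),                              # first-time buyer
    ("c", date(2014, 1, 15)), ("c", date(2014, 9, 2)),   # active within the year
    ("d", date(2013, 1, 1)), ("d", date(2014, 8, 1)),    # reactivated in August
    ("e", date(2013, 9, 30)), ("e", date(2014, 9, 1)),   # bought in September 2013
]
trend = reactivation_trend(sample, 2014, 9, 2)
```

Month-level comparison is a design choice that mirrors the excerpt's wording ("the twelve months from September 2013 - August 2014"); a day-level window would need a different boundary rule.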
 

    TIBCO Spotfire's Trends and Outliers

  • Taking Data-Driven Campaigning to the Polls

    Spotfire Blogging Team
    30 Oct 2014 | 5:55 am
    Political campaign consultants have been using public and voter data for decades. With Election Day approaching in the US on November 4th, political candidates and their campaign teams are becoming more sophisticated – if not more competitive – with their use of analytics and voter data as they seek to encourage the most desirable voters to come out to the polls and to identify potential voters who will support their candidates. The battle for voters using big data and analytics has reached a new tipping point. Investor George Soros recently donated $2.5 million to the Democratic Party…
  • How to Reimagine your Business with Analytics

    Spotfire Blogging Team
    29 Oct 2014 | 5:55 am
    One of the greatest benefits that data and analytics provide to organizational leaders is the opportunity to revolutionize existing business models and launch entirely new business lines. Historically, companies that make products, like consumer-packaged goods, have invested heavily in R&D to produce new and improved offerings. While that approach can yield new products and revenue streams, it’s not a very efficient model. Businesses today are increasingly leveraging analytics and data—including customer sentiment, transactional data, market data, and other types of data—to…
  • How Internet of Things Big Data is Driving the “Fourth Industrial Revolution”

    Spotfire Blogging Team
    28 Oct 2014 | 5:55 am
    Interconnected manufacturing systems and devices powered by the Internet of Things (IoT) are continuing to automate a broad swath of manufacturing activities, linking wired and wireless networks in the development of products to enable smart manufacturing where processes govern themselves and smart systems take corrective action when needed, according to TechRepublic. As McKinsey & Company’s Markus Loffler notes in a discussion on the topic, the emergence of IoT could drive the fourth industrial revolution following the steam engine, the conveyor belt, and the first phase of IT…
  • A Predictive Analytics Primer for Managers

    Spotfire Blogging Team
    27 Oct 2014 | 5:55 am
    For companies to benefit from mining to find the business insight gems in big data, they certainly need data scientists and analysts, but it is also vital for managers to be well-versed in predictive analytics to bolster the bottom line. Many managers might be reluctant to delve into the world of predictive analytics because of the perceived “quantitative wizardry” it takes to cull through vast amounts of data to predict customer behavior, market shifts, or other factors to gain a competitive advantage. But, many managers already are immersed in forms of predictive analytics without…
  • Data and Analytics for Manufacturing Innovation

    Spotfire Blogging Team
    24 Oct 2014 | 5:55 am
    The key to sustaining a healthy and globally competitive manufacturing sector is making better use of data and analytics, according to an article on the Center for Data Innovation website. This so-called “smart manufacturing” could create $371 billion in net global value over the next four years, according to a 2014 estimate from the market intelligence firm IDC. “It could also help make U.S. manufacturers competitive in the global economy in three main ways: streamlining the design process, improving factory operations, and managing risk in the supply chain,” the article notes. For…

    PolicyMap

  • How do you count a No-Stat address, anyway?

    Morgan Robinson
    30 Oct 2014 | 12:57 pm
    PolicyMap’s postal vacancy data from Valassis Lists has three different measures of vacancy that stem from how the USPS carriers track addresses. The most common type of vacancy is non-seasonal; this is a home or business that’s expected to be occupied year-round. A property can also be seasonally vacant, if it is a vacation home or a business that only operates for part of the year, such as a ski lodge or frozen custard stand. The third category is No-stat. These addresses aren’t actually counted as “vacant,” so what are they, how did they get into the data, and…
  • Location Affordability Version 2: Better Than the Original

    Bernie Langer
    24 Oct 2014 | 7:06 am
    Brand new datasets are great. When HUD’s Location Affordability data came out last year, we couldn’t wait to add it, because of how simply it illustrated the impact of housing and transportation costs on the budgets of various household types. What’s better than new datasets? When a new dataset is so useful, its creator decides to make it better. And that just happened with Location Affordability. Soon after the original Location Affordability was released, HUD arranged a conference call of the data’s key users, which we participated in. Based on the feedback from that call, HUD made…
  • New Unbanked Data on PolicyMap!

    Kristin Crandall
    20 Oct 2014 | 2:13 pm
    Have you been to your local bank branch lately? Perhaps withdrawn money from your checking or savings account using an ATM? Many of us who have a relationship with a traditional financial institution may take it for granted, but a lot of people are without access to these institutions. Growing attention is being paid to households who are considered “unbanked,” meaning the household lacks any kind of deposit account at an insured depository institution, or “underbanked,” meaning the household has a checking and/or savings account but has also used alternative financial services (AFS)…
  • PolicyMap Wins Gold Stevie Award for Web Programming/Design

    Katie Nelson
    15 Oct 2014 | 2:30 pm
    Philadelphia, PA – 10/15/14 – PolicyMap was named the winner of a Gold Stevie® Award in the Best Web Software Programming/Design category in The 11th Annual International Business Awards today. More than 3,500 nominations from organizations of all sizes and in virtually every industry were submitted this year for consideration in a wide range of categories, including Company of the Year, Website of the Year, Best New Product or Service of the Year, Corporate Social Responsibility Program of the Year, and Executive of the Year, among others. PolicyMap won in the Best Web Software…
  • See Round I Promise Zones on PolicyMap

    Morgan Robinson
    10 Oct 2014 | 7:15 am
    We recently added areas designated as federal Promise Zones to PolicyMap. What is a Promise Zone? These areas are the first five of 20 total communities to be designated through 2015 by the Obama administration:
      Choctaw Nation of Oklahoma
      Kentucky Highlands
      Los Angeles (Hollywood, East Hollywood, Koreatown, Pico Union and Westlake neighborhoods)
      San Antonio (EastPoint neighborhood)
      Philadelphia (Mantua neighborhood)
    Designation as a Promise Zone does not entail any additional federal grants or funding; instead, HUD, USDA, HHS, DOJ, SBA, and other federal agencies will help local government…

    Revolutions

  • Some R Highlights from the Bay Area Data Science Camp and Unconference

    Joseph Rickert
    30 Oct 2014 | 8:30 am
    by Joseph Rickert. The San Francisco Bay Area Chapter of the Association for Computing Machinery (ACM) has been holding an annual Data Mining Camp and "unconference" since 2009. This year, to reflect the times, the group held a Data Science Camp and unconference, and we at Revolution Analytics were, once again, very happy to be a sponsor for the event and pleased to be able to participate. In an ACM unconference, except for prearranged tutorials and the keynote address, there are no scheduled talks. Instead, anyone with the passion to speak gets two minutes to pitch a session. A…
  • Integrate R into applications with DeployR Open

    David Smith
    29 Oct 2014 | 2:22 pm
    If you ever find you need to embed the results of R functions — data, charts, or even a single calculation — into other applications, then you might want to take a look at DeployR Open. DeployR Open is an open-source server-based framework for R, that makes it easy to call out to the server to run R code in real time. The workflow is simple: An R programmer develops an R script (using their standard R tools) and publishes that script to the DeployR server. Once published, R scripts can be executed by any authorized application using the DeployR API. We provide native client libraries…
  • Type III tests and R

    Joseph Rickert
    28 Oct 2014 | 8:30 am
    by Terry M. Therneau Ph.D., Faculty, Mayo Clinic. About a year ago there was a query about how to do "type 3" tests for a Cox model on the R help list, which someone wanted because SAS does it. The SAS addition looked suspicious to me, but as the author of the survival package I thought I should understand the issue more deeply. It took far longer than I expected but has been illuminating. First off, what exactly is this 'type 3' computation of which SAS is so deeply enamored? Imagine that we are dealing with a data set that has interactions. In my field of biomedical statistics…
  • Create Fashion Fingerprints with R

    David Smith
    27 Oct 2014 | 2:35 pm
    How do you summarize fashion? For New York Fashion Week, the New York Times used the idea of "Fashion Fingerprints", distilling a designer's collections into small fragments highlighting the palette. Here's what Marc Jacobs' current collection looks like: Click through for an interactive version where you can explore each design, and scroll down to the bottom where you can see even greater distillation: each designer represented as abstract color blocks, with colors represented from head to toe. R user Giuseppe Paleologo noted that R is ideally suited to a task like…
  • Because it's Friday: Virtual robots learn to walk

    David Smith
    24 Oct 2014 | 1:36 pm
    For his PhD at Delft University of Technology's Faculty of Mechanical Engineering, Thomas Geijtenbeek created robots that learned how to walk. These were virtual robots — simulations in a computer system — but with realistic muscles, joints and mass that behave in real-life ways. When you see computer-generated figures move around in movies or computer games, the motion is either hand-generated (using a process not very different from that of animating old Looney Tunes cartoons), or the motion is captured from a human actor. The former method is very time consuming, and limited by…
 

    iTrend Blog

  • WHO: Tuberculosis (TB) epidemic much worse than people think

    Annie M. Dance
    23 Oct 2014 | 2:36 pm
    TB is considered to be the world’s second most ruthless killer after HIV/AIDS. Hopes of eradicating it completely are experiencing a considerable setback: the World Health Organization (WHO) announced Wednesday that last year saw twice as many new cases appear as previously estimated. The world’s preoccupation with the Ebola virus has eclipsed almost any attention to other health hazards. But the tuberculosis epidemic is now considered to be much more severe than before. TB is spread from person to person through the air. When people with lung TB cough, sneeze or spit, they propel the…
  • iTrend analytics may help understand Ebola

    Annie M. Dance
    18 Oct 2014 | 3:28 pm
    The Ebola virus outbreak in West Africa has now claimed more than 4,000 lives. A recent BBC article, "Ebola: Can big data analytics help contain its spread?", says a growing number of data scientists agree that big data analytics may help to contain the virus. Big data analytics is about bringing together many different data sources and mining them to find patterns. In the digital age, tracking the movement of potentially infected people is a lot easier. iTrend’s innovative software shows real time data. A keyword search of Ebola for the past…
  • 5 new Bitcoin facts that may surprise you

    Michael Alatortsev
    10 Jul 2014 | 10:50 am
    1. Russia had previously declared Bitcoin illegal. It has just recently softened its stance, and, judging from the prevalence of Russian language tweets in our Bitcoin data sets, the Russians are now all over the cryptocurrency. Based on volume alone, they are now dominating #bitcoin social media conversations.
    2. New cryptocurrencies are continuing to emerge; the latest example is Latium – claiming to be the first and only cryptocurrency network (no mining required).
    3. Dogecoin is dead. Wow, really.
    4. Snoop Dogg‘s comment about Bitcoin remains the highest retweeted…
  • sneak preview of iTrend 2.0 #analytics – new UI, new insights

    iTrend LLC
    8 Jul 2014 | 9:26 am
    We are testing the latest version of our social analytics platform. It offers tons of new functionality:
      multi-language support, with ability to split social data by language
      global maps, with several different views
      improved filtering
      brand-new NLP capabilities (the system can understand what people are talking about)
      additional ways to combine social with other data sources
    Plus, it is:
      super fast
      more affordable than Salesforce Marketing Cloud, Sysomos, etc
      more flexible than any leading tool
      customizable (talk to us about your specific requirements today)
    If you are interested in…
  • Comprehensive analysis of 273,000 #AmazonCart tweets

    iTrend LLC
    23 May 2014 | 8:43 am
    May 28 2014 update: 273,000 tweets were analyzed. Updated Top Selling items are shown below. Please note: we can only track products being added to cart, we don’t have access to actual checkout transactions (unless people choose to share their purchase on Twitter upon checkout – which some do).  Not all ‘sales’ mentioned below have been taken through checkout process.   Top #AmazonCart sellers, by number of items sold: Top #AmazonCart sellers, by total sales value:   We posted some preliminary data when the new feature went live on May 5 2014.  Two weeks…