Data Mining

  • Most Topular Stories

  • Often, Existing Customers Hate Change

    Kevin Hillstrom: MineThatData
    Kevin Hillstrom
    23 Oct 2014 | 8:15 pm
    I'm frequently disappointed that some vendors believe that my clients are complete idiots who are unwilling to change, as if vendors are flexible organizations that change at the drop of a hat. There are times when change is possible. There are times when the customer refuses to let you change. I think back to my time at Nordstrom. We killed off a $36,000,000 catalog business. This business had a merchandise assortment that evolved toward a 55+, rural customer ... a customer in stark contrast to the suburban/urban retail shopper that thrived in our stores. When we killed the catalog business,…
  • Tesco boss surveys a world of trouble from Dublin to Bangkok

    Data mining News
    25 Oct 2014 | 4:16 pm
    … the sale of Dunnhumby, the data mining firm behind Clubcard, or hiving …
  • More Tornadoes Recorded Doesn’t Mean More Tornadoes Occurring

    The Numbers
    Jo Craven McGinty
    24 Oct 2014 | 11:39 am
    The number of tornadoes recorded by the National Climatic Data Center has increased in recent years, but that doesn’t mean there have been more twisters.
  • Hyperparameter search, Bayesian optimization and related topics

    natural language processing blog
    10 Oct 2014 | 10:55 am
    In terms of (importance divided-by glamour), hyperparameter (HP) search is probably pretty close to the top. We all hate finding hyperparameters. Default settings are usually good, but you're always left wondering: could I have done better? I like averaged perceptron for this reason (I believe Yoav Goldberg has also expressed this sentiment): no pesky hyperparameters. But I want to take a much broader perspective on hyperparameters. We typically think of HPs as { regularization constant, learning rate, architecture } (where "architecture" can mean something like neural network structure,…
  • DMA 2014 Kicks Off Under New Management

    Latest articles from Direct Marketing News
    Direct Marketing News
    24 Oct 2014 | 1:23 pm
    Thomas Benton and Jane Berzan will preside over an event indicative of an association serving a wider array of industry segments.
 

    The Numbers

  • More Tornadoes Recorded Doesn’t Mean More Tornadoes Occurring

    Jo Craven McGinty
    24 Oct 2014 | 11:39 am
    The number of tornadoes recorded by the National Climatic Data Center has increased in recent years, but that doesn’t mean there have been more twisters.
  • Talking to Your Phone, Dining Out and Slower Runs (Statshot)

    David Goldenberg
    24 Oct 2014 | 10:29 am
    American teens don’t use smartphone voice-recognition technology much more than adults do overall, but they use it in different ways. Homemade meals make up almost 20% less of our calorie intake than they did 35 years ago. The jump in marathon participation has brought with it a sharp increase in average finishing time.
  • Using Air Traffic Data to Predict Ebola’s Spread

    Jo Craven McGinty
    17 Oct 2014 | 10:32 am
    While a number of researchers are modeling the spread of Ebola in West African countries, a group at Boston’s Northeastern University has used air traffic connections to explore how the disease might spread to the rest of the world.
  • Leaving Puerto Rico, Counting Calories and a New No. 1 (Statshot)

    David Goldenberg
    17 Oct 2014 | 9:46 am
    Far more Puerto Ricans now live off the island than on it, many fast food chains have started serving slightly lighter fare, and Mississippi State took over first place in the AP football poll this week for the first time in its history.
  • Quiz: How Do Politics Relate to Shopping Habits?

    Rani Molla
    17 Oct 2014 | 4:46 am
    People's political beliefs extend into a number of areas of their lives. According to data from a market research company, these belief systems also relate to how and what people buy.

    natural language processing blog

  • Hyperparameter search, Bayesian optimization and related topics

    10 Oct 2014 | 10:55 am
    In terms of (importance divided-by glamour), hyperparameter (HP) search is probably pretty close to the top. We all hate finding hyperparameters. Default settings are usually good, but you're always left wondering: could I have done better? I like averaged perceptron for this reason (I believe Yoav Goldberg has also expressed this sentiment): no pesky hyperparameters. But I want to take a much broader perspective on hyperparameters. We typically think of HPs as { regularization constant, learning rate, architecture } (where "architecture" can mean something like neural network structure,…
  • Machine learning is the new algorithms

    3 Oct 2014 | 10:19 am
    When I was an undergrad, probably my favorite CS class I took was algorithms. I liked it (a) because my background was math so it was the closest match to what I knew and (b) because even though it was "theory," a lot of the stuff we learned was really relevant. Over time, it seemed like the area had distilled worthwhile algorithms from interesting-in-theory-but-you'll-never-actually use algorithms. In fact, I think this is a large part of why most undergraduate CS degrees today require a course in algorithms. You have these very nice, clearly defined statements, and very elegant solutions to…
  • AMR: Not semantics, but close (? maybe ???)

    27 Sep 2014 | 9:00 am
    Okay, necessary warning. I'm not a semanticist. I'm not even a linguist. Last time I took semantics was twelve years ago (sigh.) Like a lot of people, I've been excited about AMR (the "Abstract Meaning Representation") recently. It's hard not to get excited. Semantics is all the rage. And there are those crazy people out there who think you can cram meaning of a sentence into a !#$* vector [1], so the part of me that likes Language likes anything that has interesting structure and calls itself "Meaning." I effluviated about AMR in the context of the (awesome) SemEval panel. There is an LREC…
  • Reading group notes: point/counter-point on "predict models"

    31 Jul 2014 | 6:26 am
    In our local summer reading group, I led the discussion of two papers that appeared in Baltimore last month: Marco Baroni, Georgiana Dinu & German Kruszewski, Don't count, predict! A systematic comparison of context-counting vs. context-predicting semantic vectors. ACL 2014. Omer Levy & Yoav Goldberg, Linguistic Regularities in Sparse and Explicit Word Representations. CoNLL 2014 (best paper award recipient). I love handouts, so I made a handout for this one too. I paste below the handout. All good ideas are those of the respective authors; all errors and bad ideas are probably due to…
  • Hello, World!

    27 Jul 2014 | 7:18 am
    Okay, usually Hello World is the first program you learn to write in a new programming language. For fun, I've been collecting how to say hello world in different human languages, something remarkably difficult to search for (because of the overloading of the word "language"). I have 28. I'd like to make it to 280 :). If you have one (or more) to contribute, email me, post a comment, or tweet to me @haldaume3. And of course if you think any of these is wrong, please let me know that too. 1 bar Servus Woid! 2 ca Hola Món! 3 de Hallo Welt! 4 en Hello World! 5 eo Saluton, Mondo! 6 es ¡Hola…
 

    Kevin Hillstrom: MineThatData

  • Often, Existing Customers Hate Change

    Kevin Hillstrom
    23 Oct 2014 | 8:15 pm
    I'm frequently disappointed that some vendors believe that my clients are complete idiots who are unwilling to change, as if vendors are flexible organizations that change at the drop of a hat. There are times when change is possible. There are times when the customer refuses to let you change. I think back to my time at Nordstrom. We killed off a $36,000,000 catalog business. This business had a merchandise assortment that evolved toward a 55+, rural customer ... a customer in stark contrast to the suburban/urban retail shopper that thrived in our stores. When we killed the catalog business,…
  • Omnichannel = Store Closures

    Kevin Hillstrom
    22 Oct 2014 | 8:15 pm
    Did you read this little ditty (click here folks)? The article is about Pier 1 and their "omnichannel guideposts". Read the second paragraph within Guidepost #2. Heck, I'll quote it for you: "... with approximately 60% of its leases coming up within the next three to four years, the retailer can carefully evaluate real estate needs and adjust the size of its store portfolio accordingly. Each store and market will be reviewed to determine the appropriate number of stores to maximize market share and optimize profitability." The quote may as well read something like this: "Thank God 60% of our…
  • Sitting At Home: The Problem With Retail

    Kevin Hillstrom
    21 Oct 2014 | 8:15 pm
    The story of the Fall is the inability of businesses to acquire new customers, and the inability to reactivate lapsed buyers. It's a catastrophe in retail. We have spent a full decade teaching the customer that they do not have to get in a car and drive to a store. From 2000 - 2009, it was all about "being multi-channel", which was code for "make sure the website integrates with the retail store experience." Retailers dove in, head-first. Today, it's crazy to think about a retail website that provides a fundamentally different creative and merchandising experience from the store. This level of…
  • Why Can't I Reactivate Or Acquire Customers Anymore?

    Kevin Hillstrom
    20 Oct 2014 | 8:15 pm
    The theme of the fall is this: It has become really difficult to acquire new customers. It has been difficult to reactivate lapsed buyers for a couple of years now. Consequently, the customer file is being starved. If the customer file is being starved, it is going to be really hard to grow in the future. In catalog marketing, it is now clear why it has become so hard to acquire new customers. The cataloger focused on a 50 - 75 year old customer ... and has for the past decade. The co-ops spun 50 - 75 year old customers to catalogers "at scale", creating an unprecedented level of laziness and…
  • The Biggest Story Of The Fall

    Kevin Hillstrom
    19 Oct 2014 | 8:15 pm
    The biggest story of the fall, to date, is the inability of so many e-commerce, retail, and catalog businesses to reactivate customers, or to acquire new customers. It is an epidemic, folks. You keep asking me if your situation is unique. Your situation, my friends, is not unique. Catalogers, known to grumble with the best of them, are rumbling these days about the "collapse of the co-ops". I hear the questions all the time ... "The co-op business model literally forced me to use them, and now, performance is awful and nobody will help me. What happened, and how can I fix the problem?" Hint…
 

    TIBCO Spotfire's Trends and Outliers

  • Data and Analytics for Manufacturing Innovation

    Spotfire Blogging Team
    24 Oct 2014 | 5:55 am
    The key to sustaining a healthy and globally competitive manufacturing sector is making better use of data and analytics, according to an article on the Center for Data Innovation website. This so-called “smart manufacturing” could create $371 billion in net global value over the next four years, according to a 2014 estimate from the market intelligence firm IDC. “It could also help make U.S. manufacturers competitive in the global economy in three main ways: streamlining the design process, improving factory operations, and managing risk in the supply chain,” the article notes. For…
  • Using Big Data and Analytics to Develop Fact-Based Hypotheses

    Spotfire Blogging Team
    23 Oct 2014 | 5:55 am
    There’s little doubt that the use of big data and analytics is having a dramatic impact in helping data scientists arrive at fact-based hypotheses. Data-driven decision-making is “the practice of basing decisions on the analysis of data rather than purely on intuition,” according to an article by Tom Fawcett and Foster J. Provost, professor of information systems at New York University’s Leonard N. Stern School of Business. But just as senior executives continue to balance data-driven decision-making with gut instinct, it’s also important for data scientists to avoid over-relying on…
  • Bringing Data Visualization to the Masses

    Spotfire Blogging Team
    22 Oct 2014 | 6:05 am
    Data visualization is extremely effective in helping executives and other users to absorb information and gain new insights into business and operational trends. It opens up new ways for knowledge workers to absorb information. For instance, a study of office workers conducted by Mindlab International at The Sussex Innovation Centre found that when data is displayed more visually, workers are 17% more productive. But in order for data visualization tools to be widely adopted by organizational leaders and other knowledge workers, they need to be simple to use and easily accessible. A time…
  • Three Ways to Avoid a Big Data Bottleneck

    Spotfire Blogging Team
    21 Oct 2014 | 6:03 am
    As companies grapple with the tsunami of data coming from connected devices, mobile, and the Web, there is the potential for a big data bottleneck to block business innovation. That’s the assertion of Brian McCarthy, managing director of information and analytics strategy at Accenture Analytics, in a new Harvard Business Review blog post. He suggests that organizations take three steps to avoid the analysis paralysis that can result from embracing data-driven decision-making. First, despite the warp-speed that data may appear to be flowing through the corporate network, organizations should…
  • In Recognition of Excellence in Advanced Analytics

    Spotfire Blogging Team
    20 Oct 2014 | 5:55 am
    At TIBCO Spotfire, our mission is providing companies, non-profit organizations, government agencies, and other entities with the ability to capture the right information at the right time and act on it proactively to gain competitive advantage. Occasionally, the success that’s achieved by our clients is recognized by the industry. This past week, TIBCO Spotfire’s advanced analytics solution purpose-built for a client was honored with a 2014 Data Impact Award. The award ceremony, hosted by Cloudera, was held on October 15 in tandem with the Strata + Hadoop World Conference in New York.

    PolicyMap

  • Location Affordability Version 2: Better Than the Original

    Bernie Langer
    24 Oct 2014 | 7:06 am
    Brand new datasets are great. When HUD’s Location Affordability data came out last year, we couldn’t wait to add it, because of how simply it illustrated the impact of housing and transportation costs on the budgets of various household types. What’s better than new datasets? When a new dataset is so useful, its creator decides to make it better. And that just happened with Location Affordability. Soon after the original Location Affordability was released, HUD arranged a conference call of the data’s key users, which we participated in. Based on the feedback from that call, HUD made…
  • New Unbanked Data on PolicyMap!

    Kristin Crandall
    20 Oct 2014 | 2:13 pm
    Have you been to your local bank branch lately? Perhaps withdrawn money from your checking or savings account using an ATM? Many of us who have a relationship with a traditional financial institution may take it for granted, but a lot of people are without access to these institutions. Growing attention is being paid to households who are considered “unbanked,” meaning the household lacks any kind of deposit account at an insured depository institution, or “underbanked,” meaning the household has a checking and/or savings account but has also used alternative financial services (AFS)…
  • PolicyMap Wins Gold Stevie Award for Web Programming/Design

    Katie Nelson
    15 Oct 2014 | 2:30 pm
    Philadelphia, PA – 10/15/14 – PolicyMap was named the winner of a Gold Stevie® Award in the Best Web Software Programming/Design category in The 11th Annual International Business Awards today. More than 3,500 nominations from organizations of all sizes and in virtually every industry were submitted this year for consideration in a wide range of categories, including Company of the Year, Website of the Year, Best New Product or Service of the Year, Corporate Social Responsibility Program of the Year, and Executive of the Year, among others. PolicyMap won in the Best Web Software…
  • See Round I Promise Zones on PolicyMap

    Morgan Robinson
    10 Oct 2014 | 7:15 am
    We recently added areas designated as federal Promise Zones to PolicyMap. What is a Promise Zone? These areas are the first five of 20 total communities to be designated through 2015 by the Obama administration: Choctaw Nation of Oklahoma; Kentucky Highlands; Los Angeles (Hollywood, East Hollywood, Koreatown, Pico Union and Westlake neighborhoods); San Antonio (EastPoint neighborhood); and Philadelphia (Mantua neighborhood). Designation as a Promise Zone does not entail any additional federal grants or funding; instead, HUD, USDA, HHS, DOJ, SBA, and other federal agencies will help local government…
  • PolicyMap Geocoder: Now Even More Gooder!

    Bernie Langer
    8 Oct 2014 | 11:34 am
    400 North Street, Harrisburg, PA. It’s a simple address. It’s a state office building. People work there. You can mail a letter there. But for a while, you might have had some trouble finding it on PolicyMap. A couple years ago, we upgraded our geocoder (the process that finds an address on a map) so it was much more flexible in finding addresses typed into the location bar. The new geocoder featured rooftop geocoding: It knew the precise locations of most addresses in the country. It also featured constant updates, spellchecking capabilities, and alternate street names. The old geocoder…

    Revolutions

  • Because it's Friday: Virtual robots learn to walk

    David Smith
    24 Oct 2014 | 1:36 pm
    For his PhD at Delft University of Technology's Faculty of Mechanical Engineering, Thomas Geijtenbeek created robots that learned how to walk. These were virtual robots — simulations in a computer system — but with realistic muscles, joints and mass that behave in real-life ways. When you see computer-generated figures move around in movies or computer games, the motion is either hand-generated (using a process not very different from that of animating old Looney Tunes cartoons), or the motion is captured from a human actor. The former method is very time consuming, and limited by…
  • Rocker: Docker containers for R

    David Smith
    24 Oct 2014 | 10:01 am
    If you haven't heard the buzz about Docker but you often need to spin up Linux-based VMs for testing, simulations, etc., then you should check it out. In short, Docker rocks: we use it for testing our Linux-based distros of Revolution R Open. If you want to use R and Docker together, Dirk Eddelbuettel and Carl Boettiger have made it easy with Rocker, and have also provided a nice explanation of Docker itself: While its use (superficially) resembles that of virtual machines, it is much more lightweight as it operates at the level of a single process (rather than an emulation of an…
  • A first look at Distributed R

    Joseph Rickert
    23 Oct 2014 | 8:30 am
    By Joseph Rickert. One of the most interesting R-related presentations at last week’s Strata Hadoop World Conference in New York City was the session on Distributed R by Sunil Venkayala and Indrajit Roy, both of HP Labs. In short, Distributed R is an open source project with the end goal of running R code in parallel on data that is distributed across multiple machines. The following figure conveys the general idea. A master node controls multiple worker nodes each of which runs multiple R processes in parallel. As I understand it, the primary use case for the Distributed R software is…
  • How the MKL speeds up Revolution R Open

    Andrie de Vries
    22 Oct 2014 | 5:43 am
    By Andrie de Vries. Last week we announced the availability of Revolution R Open, an enhanced distribution of R. One of the enhancements is the inclusion of high performance linear algebra libraries, specifically the Intel MKL. This library significantly speeds up many statistical calculations, e.g. the matrix algebra that forms the basis of many statistical algorithms. Several years ago, David Smith wrote a blog post about multithreaded R, where he explored the benefits of the MKL, in particular on Windows machines. In this post I explore whether anything has changed. What is the MKL? To…
  • R in Production: Controlling Runtime

    Joseph Rickert
    21 Oct 2014 | 8:32 am
    By Jamie F Olson, Professional Services Consultant, Revolution Analytics. One challenge in transitioning R code into a production environment is ensuring consistency and reliability. These challenges span a wide variety of issues, but runtime characteristics are an important operational characteristic. Specifically, production code should have a consistent, predictable runtime for a particular computational infrastructure. Among other things, this makes it possible to plan and scale IT infrastructure based on operational requirements. Analytics in general and R in particular possess…
 

    iTrend Blog

  • WHO: Tuberculosis (TB) epidemic much worse than people think

    Annie M. Dance
    23 Oct 2014 | 2:36 pm
    TB is considered to be the world’s second most ruthless killer after HIV/AIDS. Hopes of eradicating it completely are experiencing a considerable setback: the World Health Organization (WHO) announced Wednesday that last year saw twice as many new cases appear than previously estimated. The world’s preoccupation with the Ebola virus has eclipsed almost any attention to other health hazards. But the tuberculosis epidemic is now considered to be much more severe than before. TB is spread from person to person through the air. When people with lung TB cough, sneeze or spit, they propel the…
  • iTrend analytics may help understand Ebola

    Annie M. Dance
    18 Oct 2014 | 3:28 pm
    The Ebola virus outbreak in West Africa has now claimed more than 4,000 lives. A recent BBC article, Ebola: Can big data analytics help contain its spread? says a growing number of data scientists agree that big data analytics may help to contain the virus. Big data analytics is about bringing together many different data sources and mining them to find patterns. In the digital age, tracking the movement of potentially infected people is a lot easier. iTrend’s innovative software shows real-time data. A keyword search of Ebola for the past…
  • 5 new Bitcoin facts that may surprise you

    Michael Alatortsev
    10 Jul 2014 | 10:50 am
    1. Russia had previously declared Bitcoin illegal. It has just recently softened its stance, and, judging from the prevalence of Russian language tweets in our Bitcoin data sets, the Russians are now all over the cryptocurrency. Based on volume alone, they are now dominating #bitcoin social media conversations. 2. New cryptocurrencies are continuing to emerge; the latest example is Latium – claiming to be the first and only cryptocurrency network (no mining required). 3. Dogecoin is dead. Wow, really. 4. Snoop Dogg’s comment about Bitcoin remains the highest retweeted…
  • sneak preview of iTrend 2.0 #analytics – new UI, new insights

    iTrend LLC
    8 Jul 2014 | 9:26 am
    We are testing the latest version of our social analytics platform. It offers tons of new functionality: multi-language support, with ability to split social data by language; global maps, with several different views; improved filtering; brand-new NLP capabilities (the system can understand what people are talking about); additional ways to combine social with other data sources. Plus, it is: super fast; more affordable than Salesforce Marketing Cloud, Sysomos, etc; more flexible than any leading tool; customizable (talk to us about your specific requirements today). If you are interested in…
  • Comprehensive analysis of 273,000 #AmazonCart tweets

    iTrend LLC
    23 May 2014 | 8:43 am
    May 28 2014 update: 273,000 tweets were analyzed. Updated Top Selling items are shown below. Please note: we can only track products being added to cart, we don’t have access to actual checkout transactions (unless people choose to share their purchase on Twitter upon checkout – which some do). Not all ‘sales’ mentioned below have been taken through checkout process. Top #AmazonCart sellers, by number of items sold: Top #AmazonCart sellers, by total sales value: We posted some preliminary data when the new feature went live on May 5 2014. Two weeks…