Data Mining

  • Most Topular Stories

  • Why Monsanto wants to buy Syngenta

    Data mining News
    22 May 2015 | 6:26 am
    … , and provide advice based on data mining. Monsanto, based in St. Louis …
  • The Economist gets in on the AI Fluff

    Data Mining: Text Mining, Visualization and Social Media
    Matthew Hurst
    10 May 2015 | 7:54 pm
    The Economist leads with an editorial and an article on The Dawn of Artificial Intelligence. The editorial starts off with: “THE development of full artificial intelligence could spell the end of the human race,” Stephen Hawking warns. Elon Musk fears that the development of artificial intelligence, or AI, may be the biggest existential threat humanity faces. Bill Gates urges people to beware of it. Dread that the abominations people create will become their masters, or their executioners, is hardly new. But voiced by a renowned cosmologist, a Silicon Valley entrepreneur and the founder of…
  • Number of the Day: 5.3 Million

    The Numbers
    Brian Hershberg
    21 May 2015 | 9:00 am
    Today's number, 5.3 million youth baseball participants, comes at the expense of our national pastime or, some might say, passed time.
  • Gliebers Dresses: The Prawn Tower

    Kevin Hillstrom: MineThatData
    Kevin Hillstrom
    21 May 2015 | 8:10 pm
    As you know, Gliebers Dresses is a fictional story about a Catalog Executive Team. If this is not your cup of tea, why not do a Google Search to learn about how Adobe is hiring actors to read white papers in podcast form (or click here, your choice). Welcome to the Gliebers Dresses Executive Conference Room. Glenn Glieber (Owner, CEO): Ah, Memorial Day weekend. The memories. I hope you all have plans. Remember, your staff are free to leave at 3:30pm today. Meredith Thompson (Chief Merchandising Officer): I already sent my team home. Glenn Glieber: It's nine in the morning. Lois Gladstone (Chief…
  • CRM Software Sales Grow 13 Percent to $23 Billion Worldwide

    Latest articles from Direct Marketing News
    Direct Marketing News
    22 May 2015 | 9:19 am
    Big players interested clients with expanded features; pure plays scored with digital and CX solutions, says Gartner.
 

    Data Mining: Text Mining, Visualization and Social Media

  • The Economist gets in on the AI Fluff

    Matthew Hurst
    10 May 2015 | 7:54 pm
    The Economist leads with an editorial and an article on The Dawn of Artificial Intelligence. The editorial starts off with: “THE development of full artificial intelligence could spell the end of the human race,” Stephen Hawking warns. Elon Musk fears that the development of artificial intelligence, or AI, may be the biggest existential threat humanity faces. Bill Gates urges people to beware of it. Dread that the abominations people create will become their masters, or their executioners, is hardly new. But voiced by a renowned cosmologist, a Silicon Valley entrepreneur and the founder of…
  • AI, Artificial Birds and Aeroplanes

    Matthew Hurst
    9 May 2015 | 11:50 am
    The Turing Test for artificial intelligence is a reasonably well understood idea: if, through a written form of communication, a machine can convince a human that it too is a human, then it passes the test. The elegance of this approach (which I believe is its primary attraction) is that it avoids any troublesome definition of intelligence and appeals to an innate ability in humans to detect entities which are not 'one of us'. This form of AI is the one that is generally presented in entertainment (films, novels, etc.). However, to an engineer, there are some problems with…
  • How to Understand Computers in Film

    Matthew Hurst
    6 May 2015 | 4:21 pm
    When we see an act of programming, screeds of code or other interactions with computers in movies, software engineers are likely to roll their eyes: when Chappie's coder has to write 'terabytes of code', or when Ford's computer guy has to 'write a special program' to crack a password in one of the Jack Ryan movies, etc. I rolled my eyes at these. But I also realize that these interactions are just symbols. They are placeholders for 'someone doing some coding'. If we actually saw someone doing some coding, I think we'd roll our eyes for another reason. This…
  • How the Tech Media Keeps Artificial Intelligence at a Distance

    Matthew Hurst
    4 May 2015 | 1:30 pm
    In sympathy with yesterday's post about AI as presented in films, consider this recent article from the Wall Street Journal: Artificial Intelligence Experts are in High Demand. A list of mostly machine learning experts is produced as evidence for the topic of the article. There is an unfortunate trend being presented to the public in this space in which the term 'artificial intelligence' is being used to draw readers with stories of real technical achievements in the space of machine learning and machine perception (recognizing a cat in an image is not an act of artificial…
  • How Hollywood Keeps Artificial Intelligence at a Distance

    Matthew Hurst
    3 May 2015 | 8:40 pm
    When something doesn't exist (like artificial intelligence) it's easy to think that there is some missing piece of magic required to bring it into existence. There has been a growing interest in movie depictions of AI of late, and these all seem to require some sort of non-linear step to realize this technology. Ex Machina (which I really enjoyed) required a new sort of hard/software in the form of a jelly-like substance. Chappie (which I also liked, though I generally prefer cheese and ham combined in a sandwich) required 'terabytes of coding' and a good amount of luck…

    The Numbers

  • Number of the Day: 5.3 Million

    Brian Hershberg
    21 May 2015 | 9:00 am
    Today's number, 5.3 million youth baseball participants, comes at the expense of our national pastime or, some might say, passed time.
  • Time Check: Markets Fret a Leap Second

    Brian Hershberg
    19 May 2015 | 7:04 am
    As The Numbers wrote in January, the leap second is due to make one of its periodic appearances later this year. And markets are jumpy.
  • Behind The Numbers: Food for Thought

    Jo Craven McGinty
    15 May 2015 | 9:09 am
    When it comes to data, the language used to explain is often as important as the numbers. The USDA's estimates of where folks are eating is a prime example.
  • Numbers Noise: IP Addresses Near Limit

    Brian Hershberg
    13 May 2015 | 9:55 am
    The U.S. is running out of Internet Protocol addresses. But this doesn't mean the Internet has reached its limit.
  • Number of the Day: $179.4 Million

    Brian Hershberg
    12 May 2015 | 8:57 am
    Today's number, $179.4 million, comes courtesy of the auction house Christie's, which on Monday evening sold Picasso's "Women of Algiers (Version O)" to an anonymous telephone bidder.
 

    Kevin Hillstrom: MineThatData

  • Gliebers Dresses: The Prawn Tower

    Kevin Hillstrom
    21 May 2015 | 8:10 pm
    As you know, Gliebers Dresses is a fictional story about a Catalog Executive Team. If this is not your cup of tea, why not do a Google Search to learn about how Adobe is hiring actors to read white papers in podcast form (or click here, your choice). Welcome to the Gliebers Dresses Executive Conference Room. Glenn Glieber (Owner, CEO): Ah, Memorial Day weekend. The memories. I hope you all have plans. Remember, your staff are free to leave at 3:30pm today. Meredith Thompson (Chief Merchandising Officer): I already sent my team home. Glenn Glieber: It's nine in the morning. Lois Gladstone (Chief…
  • When One Category Works Against Most Of Your Categories

    Kevin Hillstrom
    20 May 2015 | 8:10 pm
    Sometimes, you have a merchandise category that works against the rest of the business. It happens more often than you'd think. When I worked at Nordstrom, we had Cosmetics, Mens Footwear, and Womens Footwear. These categories helped all other categories - meaning that when a customer purchased Cosmetics, the customer instantly became more likely to buy from Accessories, for instance. But the opposite wasn't necessarily true. Growing Accessories helped Accessories. Growing Cosmetics helped Cosmetics and helped Accessories. In my Nordstrom example, which category should be featured in marketing,…
  • David Letterman

    Kevin Hillstrom
    19 May 2015 | 8:11 pm
    The guy was throwing watermelons off of a building. That's the kind of thing that was interesting to a seventeen year old aspiring to become a statistician. Mr. Letterman had re-written the rules of late night television. Around the time Larry Bud Melman delivered periodic overnight comedic thrills, companies like Lands' End and L.L. Bean were re-writing the rules of cataloging. Prior to 1980, "big books" dominated the scene ... JCP, Wards, Spiegel. Five hundred pages, delivered a few times a year. Their model was upset by 124 pages mailed twelve times a year ... customers had not seen that…
  • Merchandise Category Interaction

    Kevin Hillstrom
    19 May 2015 | 8:10 pm
    You've enjoyed an annual physical, right? You learn that your cholesterol is too high (too many nachos), so you are prescribed a drug. Before the drug is prescribed, however, your doctor determines if the drug will interact with other drugs you are taking. You don't want drugs interacting negatively, do you? The same issue happens in your business. But we're not trained to measure interactions. The vendor community prescribes solutions. What if their solutions cause negative interactions within your business? I assure you, the solutions some of those folks prescribe for you yield negative and…
  • You Are A Doctor, Not A Pharmacist

    Kevin Hillstrom
    18 May 2015 | 8:10 pm
    The modern theme of marketing misses out on all of the magic and glory bubbling within your merchandising ecosystem. You are asked to market in all channels (#omnichannel), and you are asked to pay tolls (Google, Facebook, Abacus) in an effort to get customers to purchase. In other words, the industry wants to help you prescribe a drug ... the industry wants you to count pills and put the pills in a bottle and then communicate side effects (omnichannel sales gains). Your job, quite honestly, should be the opposite. You should be a doctor. You should diagnose problems, and then you, yes you,…
 

    TIBCO Spotfire's Trends and Outliers

  • How ZE PowerGroup Grew Market Share and Increased Customer Satisfaction

    Spotfire Blogging Team
    21 May 2015 | 5:55 am
    CHALLENGE: As Waleed El-Ramly, chief product officer of ZE, explains, “To stay on top, you have to be innovative, and build new products and features all the time.” ZE clients were looking for self-service BI, and when ZE examined the build vs. partner equation, the answer was clear. “We knew it would take more developers, a long time to market, and a change in focus. So, we started to look at the market for something clients wanted.” El-Ramly says that because ZE has an insatiable desire to grow, empowering clients with the best total solution was a top priority. “We want to be in…
  • The Transformative Power of Big Data Analytics in Sports

    Spotfire Blogging Team
    20 May 2015 | 5:59 am
    It all started with Moneyball, the 2011 film detailing how general manager Billy Beane used analytics to vastly improve the fortunes of his Oakland A’s. And while the film doesn’t come with a happy ending—the A’s are eliminated on their way to the World Series—it helped reignite the fire for professional sport associations to leverage the massive amount of data they possess. Bottom line? It’s a game-changer. Here’s why. Seeing is Believing: According to Forbes, a number of high-profile soccer clubs, such as Premier League’s Arsenal, are now spending big on analytics. One key…
  • Transforming Manufacturing with Predictive Analytics

    Spotfire Blogging Team
    18 May 2015 | 6:12 am
    The manufacturing industry is no stranger to the advent of Big Data. It is quite impressive that most manufacturers have been embracing disruptive technologies to stay ahead of their competition and thrive. One of the most powerful capabilities that plays a vital role in the manufacturing world is the application of predictive analytics capabilities—which allows them to move their businesses forward. Specifically, the ability to extract meaningful insights about products, processes, production, yield, maintenance, and other manufacturing functions, as well as the ability to make decisions…
  • Spotfire Tips & Tricks: Interactively draw territories on a map with a TERR data function

    Peter Shaw
    14 May 2015 | 7:22 am
    Defining territories on a map is a useful analysis and reporting method. Examples include visualizing customers near stores, or city residents near fire stations, schools, homeless shelters or hospitals. Once a collection of locations has been defined, drawing a bounding polygon in Spotfire is straightforward using a TERR data function described herein. In this example we use Census Tract locations near Boston, Massachusetts and quickly mark some of these in Spotfire using the lasso (Alt key, or option key on Mac): We use a simple TERR data function to add a new Spotfire data table and a…
  • Anatomy of a Decision: Analysts and Researchers

    Spotfire Blogging Team
    13 May 2015 | 5:55 am
    In the last article in our Anatomy of a Decision series, we examined the benefits of collaborative analytics and how they enable companies to expand beyond individual analyses and leverage the strengths of sharing and discussing insights between colleagues to reap the benefits of collective wisdom and harness diverse perspectives for multidimensional decision-making. Still, for individual users such as analysts and researchers, analytics tools require certain attributes in order for users to achieve success. Because of the nature of their work, analysts need powerful self-service tools that…

    PolicyMap

  • Maggie McCullough Profiled by Lincoln Institute

    Jonah Taylor
    20 May 2015 | 2:09 pm
    Data-Driven Decision Making: The ability to visualize data – where residents have health insurance, how close they are to a park or library, or who is going through foreclosure – has become a prerequisite in citybuilding these days. It’s almost hard to imagine making policy decisions or launching initiatives without big data as a guide. And as Maggie McCullough, founder and President of PolicyMap, made clear in a presentation at the Lincoln Institute last month, the technology is getting better all the time. Read the full profile on the Lincoln Institute blog Watch Maggie’s…
  • Mapchats – Mapping the Way to Fair Housing and Environmental Justice

    Elizabeth Nash
    20 May 2015 | 11:50 am
    PolicyMap’s popular Mapchats series continues next week when we sit down with MacArthur “Genius” Fellow John Henneberger and Charlie Duncan from the Texas Low Income Housing Information Service. We’ll discuss the pivotal role of maps in their Fair Housing and environmental justice work. Share your story of using maps in your work for a chance to discuss live on the webinar with our distinguished panelists. Panelists include: John Henneberger, Co-Founder, Texas Low Income Housing Information Service: John Henneberger received a B.A. (1976) from the University of Texas at Austin. He…
  • The Latest Demographics, Income and More Now Available on PolicyMap

    Elizabeth Nash
    14 May 2015 | 11:16 am
    While a few of us at PolicyMap were enjoying the American Community Survey Data Users Conference this week in Washington DC, our developers at 3D-L were hard at work getting the ACS data updated on PolicyMap! This year’s ACS conference highlighted some fascinating work with ACS data, from trends in marriage rates to the lifecycle of a piece of Census data. PolicyMap’s own Morgan Robinson discussed her process for developing neighborhood-level health indicators using multilevel modeling with ACS data on metropolitan area status, race, age, and income characteristics. You can find all this…
  • PolicyMap Updates Our Economy Menu

    Katie Nelson
    13 May 2015 | 12:39 pm
    At PolicyMap we think data should be fun. So we organize our menus to allow you to browse and find the data you need. And, at the same time, to discover a few new things about the places you care about. We’ve heard from users over time that our economy data has been particularly difficult to navigate, so we decided it was high-time we think through how best to present our data on jobs, industries, workforce, and employment. Read on to learn a bit more about what you can find in PolicyMap’s updated Economy menu. Jobs and Industries: Many people come to PolicyMap for the most granular…
  • PolicyMap at the 2015 Commonwealth Housing Forum!

    Elizabeth Nash
    7 May 2015 | 8:00 am
    We’re pleased to be presenting today at Pennsylvania Housing Finance Agency’s Commonwealth Housing Forum. If you’re at the conference, be sure to join us at 3:30 today for the session “Tapping into Readily Available Data to Inform and Guide Housing Initiatives.” I’ll be speaking along with Keith Wardrip, Community Development Research Manager of the Philadelphia Federal Reserve Bank. We’re glad to be in such good company and are looking forward to Keith’s discussion of the Fed’s Community Development Dashboard.

    Revolutions

  • Revolution R Open 3.2.0 now available for download

    David Smith
    22 May 2015 | 7:00 am
    The latest update to Revolution R Open, RRO 3.2.0, is now available for download from MRAN. In addition to new features, this release tracks the version number of the underlying R engine version (so this is the release following RRO 8.0.3). Revolution R Open 3.2.0 includes: The latest R engine, R 3.2.0. This includes many improvements, including faster processing, reduced memory usage, support for bigger in-memory objects, and an improved byte compiler. Multi-threaded math processing, reducing the time for some numerical operations on multi-core systems. A focus on reproducibility, with…
  • First Day Highlights from the Extremely Large Databases Conference

    Joseph Rickert
    21 May 2015 | 9:00 am
    by Joseph Rickert. The 8th XLDB (Extremely Large Databases) Conference opened at Stanford on Tuesday with an outstanding program. This conference has been providing leadership in the "Big Data" world since its first workshop, which was held in 2007. For example, the summary report for that year notes: "Both communities (industry and science) are moving towards parallel ... architectures on large clusters of commodity hardware, with the map/reduce paradigm as the leading processing model." but also observes that: "The map/reduce paradigm ... will likely not be the…
  • Open source software has changed the way we do business

    David Smith
    20 May 2015 | 3:50 pm
    Earlier this month TechCrunch published an article of mine, "The Business Economics And Opportunity Of Open-Source Data Science". With this article I wanted to share how open-source software has disrupted the economics of doing business, now that data is a fundamental component of every business's operations. Open source projects like Hadoop and R, coupled with commodity hardware, have fundamentally changed the equation when it comes to the scale and scope of the problems that can feasibly be tackled. If you'd like to read more on this topic, one other article I…
  • Fast parallel computing with Intel Phi coprocessors

    Joseph Rickert
    19 May 2015 | 8:30 am
    by Andrew Ekstrom, recovering physicist, applied mathematician and graduate student in applied Stats and systems engineering. We know that R is a great system for performing statistical analysis. The price is quite nice too ;-) . As a graduate student, I need a cheap replacement for Matlab and/or Maple. Well, R can do that too. I’m running a large program that benefits from parallel processing. RRO 8.0.2 with the MKL works exceedingly well. For a project I am working on, I need to generate a really large matrix (10,000x10,000) and raise it to really high powers (like 10^17). This is part of my…
  • What's new in Revolution R Enterprise 7.4

    Bill Jacobs
    18 May 2015 | 6:18 am
    by Bill Jacobs, Director Technical Sales, Microsoft Advanced Analytics. Without missing a beat, the engineers at Revolution Analytics have brought another strong release to users of Revolution R Enterprise (RRE). Just a few weeks after the acquisition of Revolution Analytics by Microsoft, RRE 7.4 was released to customers on May 15, adding new capabilities, enhanced performance and security, and faster and simpler Hadoop editions. New features in version 7.4 include: addition of Naïve Bayes Classifiers to the ScaleR library of algorithms; optional coefficient tracking for stepwise regressions. …
 

    Data Science Notes

  • Halfway Book Review: The Signal and the Noise

    20 May 2015 | 9:24 am
    This post should make my librarian wife quite happy, as it is a book review. And actually, a book that she picked out for me to read, Nate Silver's The Signal and the Noise. I'm currently only halfway through the book, but I have a lot of thoughts, and I'm not sure I'll get around to writing a blog entry when I finish. To summarize, the book is an explainer for why some predictions are good, and why some are bad. The book is written from a simplistic perspective, such that even people with no background in statistics can understand the underlying concepts. Silver walks the reader…
  • Kansas Education Funding Analysis Part 1

    15 May 2015 | 1:02 pm
    Public education funding in Kansas is a huge mess.  The last ten years have seen multiple lawsuits, annual battles in the State Legislature on education spending, and massive changes in the State education funding formula.  Why is this such a big deal in Kansas?  A couple of factors.  First, the Kansas Constitution has a section that says the State must adequately fund education.  Second, the Kansas legislature is largely made up of small-government fiscal conservatives, so it is relatively difficult to increase public spending for anything.  Twice in the last…
  • My Top R Libraries

    12 May 2015 | 9:49 am
    A couple of weeks ago I posted a list of my top five data science software tools, which received quite a few pageviews and shares across the internet. As someone told me, people just freaking love numbered lists. Later that week I saw Kirk Borne post on Twitter regarding the top downloaded R packages, which was interesting, but a bit predictable. The top packages contained elements that would be relevant across many fields that use R. Packages like plyr, which has a lot of handy data-handling tools, and ggplot2 which is used to plot data. This list was interesting, but…
  • Have you tried logarithms?

    8 May 2015 | 10:03 am
    So, cue XKCD reference. Randall Munroe is making fun of varying levels of technical knowledge and rigor in different fields. It's a funny cartoon, and hopefully not too offensive to any readers of this blog involved with Sociology or Literary Criticism (I doubt there are many). The irony here is in the first panel, in this case playing off a seemingly ignorant question of "Have you tried Logarithms?" In analytics, I actually say "have you tried logarithms?" quite a bit. The reason is simple: to emulate different shapes of relationships that occur in nature, sometimes…
  • Modeling Fitness Tracking and Messing Up Models

    6 May 2015 | 9:21 am
    I've never met a metric that I couldn't screw up in some way after measuring, modeling, and focusing on it. This happens to me quite a bit at work, where someone will say "hey, we really need to work on our 'X' ratio." Then everyone works on the X ratio for a few weeks which makes X go up, predictably. This creates a couple of problems, though: X gets better for not-normal, and often not-model-able reasons (maybe not a concern for the business, but certainly irritating to me). X gets better sometimes at the expense of the business. I should probably blog on this later,…