The Entrepreneur Forum | Startups | Entrepreneurship | Starting a Business | Motivation | Success

Question For Devs about Data Congruence, Report Generation, and Compliance

G

Guest92dX

Guest
How does data congruence, or the lack thereof, happen on a management platform that generates reports? If I work for a large company that needs reports on different facets of the business, how can any of it lack congruence?

Is this a large word for: we just use Excel?

Assuming the platform someone currently uses generates exportable reports, would I be able to build a platform that organizes the data and makes it "congruent"? Would I be able to "train" that platform to build different types of "congruent" data?

This is probably a silly question with an easy answer. Please help.
 


csalvato

Platinum Contributor
Read Millionaire Fastlane
Summit Attendee
Speedway Pass
May 5, 2014
1,377
3,886
896
34
Rocky Mountain West
What's the context?
 
Guest92dX

Guest
Full life-cycle management for an Engineering Procurement Construction company. The end platform has the scope of Enterprise Resource Planning.
 
Guest92dX

Guest
@csalvato

Let's just call it data inconsistency as that's what I'm thinking it really is.

Do you have an answer for any of my questions? Just tell me in computer science terms and I'll look the rest up.
 

csalvato

There are many reasons for data inconsistency in software platforms, whether it's a management tool or some other tool. Any software that stores data is subject to several different problems that will ultimately cause data inconsistencies.

Your question is vague and doesn't have much context, so I'll do my best with a fabricated example. Hopefully it matches the problem you're actually facing.

The Story of Gonzo, Inc.'s Data

The first epoch
Gonzo is a company that sells eCommerce subscription boxes. When Gonzo started, they used a Wordpress site patched together from some random membership plugins, and did all their order fulfillment in Excel and on paper. They did this from 2007 to 2008, as they built demand for the product.

Data Sources:
  1. Wordpress site
  2. Excel fulfillment spreadsheet

The second epoch

When Gonzo hit about $500k in annual sales, they started hiring employees and switched their order processing from Wordpress over to Magento. Magento had better order handling, and kept some, but not all, of their fulfillment data in the system. For example, they could log when an order was fulfilled, but they weren't logging the weight or dimensions of packages, nor the exact cost of shipping, just what they charged the customer.

This lasted 2 years, until Shopify entered the scene. When they saw that Shopify was easier than Magento, they made the switch in 2010.

Data Sources:
  1. Wordpress site (legacy data)
  2. Excel fulfillment spreadsheets (now, multiple versions of the same spreadsheet)
  3. Magento site


The third epoch

Shopify was logging all the fulfillment data that it sent off to the warehouse, and all the fulfillment data was now logged by their in-house staff of one warehouse employee. After a couple of years, they were doing $5M/year in sales, and their warehouse guy was so burned out that he quit. They realized that they needed external fulfillment providers, so they worked with both Amazon FBA and Shipstation.

In this epoch, they also leveled up their marketing by advertising on Facebook, rather than relying solely on wholesaling relationships with retailers. They started logging user behavior using Mixpanel. Their new developer was logging data to better track conversions from the ad to the sale.

Somewhere along this path, they lost their Wordpress data and old Excel spreadsheets, but didn't care, because they weren't using them at the time.

Data Sources:
  1. Wordpress site (legacy data)
  2. Excel fulfillment spreadsheets (now, multiple versions of the same spreadsheet)
  3. Magento site (legacy data)
  4. Shopify sales platform
  5. Shopify fulfillment reports
  6. Shipstation reporting
  7. Amazon FBA reporting
  8. Mixpanel behavioral data
  9. Facebook Ads data

The fourth epoch

Shopify was no longer scalable for them and their types of upsells, so in 2015 they switched to a custom-created eCommerce platform made by an internally hired team of developers. They started advertising on multiple platforms, such as Facebook and AdWords, as well as investing heavily into SEO campaigns and billboard campaigns.

Data Sources:
  1. Wordpress site (legacy data)
  2. Excel fulfillment spreadsheets (now, multiple versions of the same spreadsheet)
  3. Magento site (legacy data)
  4. Shopify sales platform
  5. Shopify fulfillment reports
  6. Shipstation reporting
  7. Amazon FBA reporting
  8. Mixpanel behavioral data
  9. Facebook Ads data
  10. AdWords Ads data
  11. SEO Data - SEMRush
  12. SEO Data - SpyFu
  13. SEO Data - Google Webmaster Console
  14. Billboard campaign data spreadsheet

Present Day
Now, in 2018, Gonzo is ready to take on investors. They believe they have a product lineup and user experience that can scale up the company to a $500M valuation. Their new investors want to see some numbers, so they start asking questions about company performance through the years. So let's see where inconsistencies and challenges crop up.

Inherent Challenges
  1. No self-contained system is perfect. Bugs cause data loss or inaccurate data, so any system, all on its own, will have some degree of error, and that error is usually unpredictable in practice. The custom ecommerce software, for example, might not have been recording something properly, or a change somewhere along the line may have wiped out an entire type of data. This is not uncommon.
  2. When mixing together data from two imperfect systems, the first issue compounds. For example, in the fourth epoch, there are inconsistencies between the sales data from the custom platform and the Mixpanel attribution data. This could be because one system lost some data, so one system has more data than the other.
  3. When mixing together data from two systems, the rules may be subtly different. For example, Shopify data from the third epoch may have one definition of a completed order, and the new custom eCommerce platform a different one. So when you try to marry the two data sets together, the data seems congruent when it isn't.
  4. A different flavor of #3: sometimes data sources don't define data the way you need it defined. For example, data from Facebook comes in the form of view-through conversions and click-through conversions, but not first-click conversions. It takes a savvy analyst or data scientist to realize that and marry up the data properly (if that's possible at all).
  5. Sometimes data from two systems overlaps in non-obvious ways. For example, in the fourth epoch, the system records a total number of sales. Let's say that's 100,000 sales. But when you total up the number of sales reported by Facebook, AdWords, SEO and billboards, you are left with 152,000 sales. How? Multiple ad platforms are taking credit for the same sale, because one customer came through more than one channel. Because the data from these ad platforms is incomplete and imperfect, it's impossible to know the true total from any single source.
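
To make #5 concrete, here is a minimal Python sketch of how per-channel totals can add up to more than the real number of sales. It assumes each channel's report can be tied back to an order ID, which in practice is often not possible (that's part of why the problem is hard). The channel names, order IDs, and the summarize_claims function are made up for illustration.

```python
from collections import defaultdict

# Hypothetical reports: the order IDs each channel claims credit for.
channel_claims = {
    "facebook": {"A1", "A2", "A3"},
    "adwords": {"A2", "A3", "A4"},
    "seo": {"A3", "A5"},
}

def summarize_claims(claims):
    """Compare the sum of per-channel claims with the number of distinct orders."""
    # Naive total: add up what every channel says it sold.
    claimed_total = sum(len(order_ids) for order_ids in claims.values())

    # Invert the mapping so each order lists the channels claiming it.
    channels_by_order = defaultdict(set)
    for channel, order_ids in claims.items():
        for order_id in order_ids:
            channels_by_order[order_id].add(channel)

    unique_total = len(channels_by_order)
    # Orders claimed by more than one channel are the source of the inflation.
    overlaps = {
        order_id: sorted(channels)
        for order_id, channels in channels_by_order.items()
        if len(channels) > 1
    }
    return claimed_total, unique_total, overlaps

claimed, unique, overlaps = summarize_claims(channel_claims)
print(claimed, unique)  # 8 claimed sales vs. 5 distinct orders
print(overlaps)         # A2 and A3 are each claimed by several channels
```

The gap between the claimed total and the distinct total is the double counting. Deciding which channel actually deserves credit for A2 and A3 is the attribution problem, and this sketch doesn't attempt it.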
In short, when you change data sources, build on them, and add more into the mix, the problems of marrying up these different kinds of data compound. Eventually, the data a company is using turns out to be more directional than accurate or precise.
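
As a rough illustration of what working on #2 and #3 from the list above might look like, here is a small Python sketch. It maps each platform's own status values onto one shared definition of "completed", then compares the order IDs two systems have on record. The status names, record layout, and function names are all assumptions for the example, not the real Shopify or Mixpanel formats.

```python
# Assumed status vocabularies; the real platforms' values may differ.
COMPLETED_STATUSES = {
    "shopify": {"fulfilled"},
    "custom": {"delivered"},  # here "shipped" alone does not count as completed
}

def is_completed(source, status):
    """Apply one shared definition of 'completed' to every source."""
    return status in COMPLETED_STATUSES[source]

def completed_orders(records):
    """Keep only the records that are completed under the shared definition."""
    return [r for r in records if is_completed(r["source"], r["status"])]

def missing_between(ids_a, ids_b):
    """Return the IDs seen only in the first system and only in the second."""
    return sorted(ids_a - ids_b), sorted(ids_b - ids_a)

records = [
    {"source": "shopify", "id": "S1", "status": "fulfilled"},
    {"source": "shopify", "id": "S2", "status": "paid"},
    {"source": "custom", "id": "C1", "status": "shipped"},
    {"source": "custom", "id": "C2", "status": "delivered"},
]
completed = completed_orders(records)
print([r["id"] for r in completed])  # ['S1', 'C2']

# Hypothetical order IDs recorded by the sales platform and by Mixpanel.
platform_ids = {"C1", "C2", "C3"}
mixpanel_ids = {"C2", "C3", "C4"}
only_platform, only_mixpanel = missing_between(platform_ids, mixpanel_ids)
print(only_platform, only_mixpanel)  # ['C1'] ['C4']
```

Writing the definitions down in one place at least makes the disagreement visible. It doesn't recover data a system never recorded, so the orders that show up on only one side still need a human to investigate.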
 
Guest92dX

Guest
Thank you so much @csalvato. Is this usually a solvable problem or am I grasping at straws?
 

csalvato

It's a heady and hard problem. Lots of companies in this space are looking to solve it, particularly on the sales attribution side (since that justifies greater marketing and growth spend).

It's probably solvable, but there aren't many great solutions out there yet.
 
Guest92dX

Guest
Thank you so much! You're the best! Stay who you are.
 
