Big data is an expansive umbrella with startups of all stripes squatting under it, even as the most successful and powerful data miners of the modern web are undoubtedly the dominant consumer platforms -- Google, Facebook, Apple and Amazon in the West, and China's WeChat in Asia -- whose vast digital empires yield them both quantity and quality of data to use as they please.
Yet these tech giants aren't generally in the business of sharing their data holdings to help others -- unless you want to pay them to target digital advertising on your behalf.
Which is where Atomico-backed startup Teralytics spies its own big data opportunity. It's built a platform selling analytics services to customers such as government agencies and transport companies that want to understand complex problems relating to human mobility -- so analyzing things like transport pressure points, or considering the optimum location for a new road, or even monitoring urban air quality without the need to deploy CO2 sensors.
The European startup has been building its analytics platform for around four years at this stage, working in semi stealth up to now to put its core tech in place, while also delivering projects with early partners and customers, such as the air quality monitoring example cited above.
The original idea for the business was spun out of ETH Zurich university, sparked by a conversation one of the co-founders had with a local telco which was looking for help analyzing commuter data on behalf of the government.
Co-founder Georg Polzer says he and his co-founders ended up sleeping in the company's data center as they worked to write code to come up with the answer to the problem -- though he notes he's past the point of personally pulling coding all-nighters himself now.
"Back at ETH we were doing a lot of data analysis and by living in Zurich, working in Zurich we got exposure to a Swiss telecom company," he tells TechCrunch. "I got into a conversation with a person there, who mentioned the Swiss government wants to understand how long people travel across the country, to leave their home and reach their destination during the day, and that person at Swisscom said can you help us? And we said sure, we can do it. And with this project we started the company."
Teralytics has raised around $44 million to date, telling TechCrunch it's taken investment up to a Series C level, and amassing a team of 65 people working out of headquarters in Zurich and offices in New York and Singapore. Along with Niklas Zennström's Atomico fund, investors include Swiss VC firm Lakestar and Hong Kong-based Horizons Ventures.
Its current go-to-market proposition is focused on analyzing human mobility and behavior to meet the changing needs of urban planners and transport providers -- sitting under another techie umbrella: smart cities.
"Big cases that we've either worked on or are working on include things like how can a large transit network operator minimize the amount of money they spend on operating it whilst at the same time providing a better service to citizens," says incoming CEO Alastair MacLeod, who brings a background in telcos to help with the startup's next steps ramping up commercialization of its platform. "We also work with long distance operators on what should their capital plans look like for the next ten years.
"But it's all centered around... the large topics of what's happening to urban environments and what's happening to transportation as new modes of transport come on stream. It's just a huge and mushrooming area and we play directly into that space, helping inform the current providers on how they could do better. But also helping them think about what comes next."
Polzer points to disruption already happening in the transport space as ride-hailing providers like Uber push into cities and station-free bike sharing startups like Ofo proliferate, while also noting larger changes looming -- such as electric and autonomous cars -- which promise to reshape urban infrastructure even more radically in the coming years.
Cities will need powerful analytics tools to understand and respond to these changes, he argues.
"There's a whole technology shift happening in how we move around, and how we organize cities, and this shift needs to be understood and modeled and designed right -- and the right decisions need to be taken," says Polzer. "And I think we really can play that key role to help shape this wave of change."
"Every single mayor I talk to says can you please help me understand the effect of Uber and Lyft on my city," he adds. "These are questions they have. It's very, very much on their minds."
So -- to the really big question -- where is Teralytics sourcing the data that powers its platform? How is it able to track city dwellers' movements in such detail and link them to highly specific behaviors?
In the first instance it's partnering with telcos whose mobile subscriber bases offer a large, rich, reliable and representative source of population data to be mined for insights, says Polzer, while also looking at bolting on additional data sources as it moves forward (integrating wi-fi network data is something it's currently working on, for example).
But the really big data crutch here is definitely telcos -- who, after tech's platform giants, hold some of the richest and most detailed data around. Even as they also typically have more stringent regulatory strictures (vs Internet businesses) on what they can do with customers' sensitive personal data.
And with very good reason -- given they provide access to connectivity, not just individual apps and services, affording them a highly intimate overview of their users' lives.
"The great thing about operator data is that usually in the market there are three to four operators, which always have at least 10 to 15 per cent marketshare. And if you look at other data sources, there's just no other data sources with that breadth across the population," says Polzer, discussing the advantage of attaching a big data business to carriers' heavily loaded pipes.
"Also telcos, are debiased among the population; it's very nicely distributed -- this means you have rich people, poor people, young people, old people. Which makes the extrapolation much much more reliable vs if you just get data from one smartphone app which is used by teenagers in certain areas. So this data is very nicely balanced and therefore can be extrapolated out to the whole population."
He also talks up the resilience of relying on telcos for the core data-set -- given that major network operators are not likely to vanish overnight. Whereas data plays that rely on an app source, for example, might be more vulnerable to passing fads taking them out of business and cutting off the flow of behavioral intel.
Plus he argues that the national constraints of operators help bolster Teralytics against shifts in individual partners' business decisions -- by positioning the business to have additional potential data providers standing by as the nature of the telecoms market necessitates it working with "many operators across different markets".
The startup has worked with eight different telcos in total up to now, says MacLeod, and has three "active discussions" in new markets, while also flagging a recently signed partnership with Three Hong Kong. Current customers include governments, transportation operators and companies in Germany, Singapore and the U.S. (It's not disclosing all its carrier partners by name but -- for the record -- says it's not currently working with TechCrunch's parent Oath's parent Verizon.)
It uses machine learning algorithms to extrapolate insights from its carrier partners' data-sets -- with key data boiling down to location information based on cell tower pings (and wi-fi data incoming), combined with clickstream data from mobile devices, which mean it can derive more granular insights by triangulating which app/website is being used at a given location/velocity -- so for example, Teralytics' platform could identify not just that a group of people are traveling around a city in cars but that they're traveling in ride-share vehicles.
"The nature of the data is you get a lot of data points per person per day," says Polzer. "For example, in comparison to app SDK data, you might see a person once or twice a day when that person opens the app. While we see that person, guaranteed, around 150 times a day. When you them look into the use-case we tackle -- which is mobility, understanding how humans move around, which routes they take, which mode of transport they take -- you need to have that path, that journey of a person. And we believe the only data that really provides that is telecom data."
"A lot of the reasons why operators work with us is because we exactly have developed an ability to, we call it, extrapolate -- so from one sub-set of the population we extrapolate out to the whole population," he adds.
"You don't necessarily know this individual person did that individual thing, but when you're talking about it in terms of groups -- which we do anyway for privacy reasons -- you can infer patterns of behavior around how many did this sort of thing, or how many took a ride-share, which we may or may not identify by an individual brand, vs how many took some other mode of transport. But it all effectively comes from different types of data being overlaid in a fairly sophisticated machine learning engine," says MacLeod.
Balancing privacy concerns is clearly going to be a critical consideration for the success of the venture -- which needs telcos to buy in to pump in the big data fuel, and therefore also needs their customers to be comfortable with the idea that their personal data -- i.e. information about where they go and what they do online -- might be being shared with, for example, government agencies.
So even if you start from the premise of carrier data being anonymized, as Teralytics says is the case here, a system could be built that tracks an unnamed user's location and displays a trace from a street address to a commercial address and back again twice a day, for example, and the person looking at that data might easily infer they're seeing a person's home and workplace -- and then it's potentially very easy to re-identify that individual.
However, Teralytics claims no such re-identification risks are attached to its system because of how it's baked privacy considerations into the design. Polzer says it's using a variety of proprietary techniques to handle the data in a way that preserves user privacy -- although he won't go into too much detail, claiming commercial sensitivity. But he says the system has passed muster with strict German data protection watchdogs, and expresses confidence it's robust enough for any data protection regime.
One key aspect is that as well as anonymizing the data they also claim they are never linking data traces to individual identities -- rather they only provide analyses based on aggregation of groups' movements and habits. They also perform analysis of the data on site, behind carriers' firewalls, to reduce potential security risks -- so they're not lifting subscriber data elsewhere for processing.
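To make the aggregation idea concrete, here is a minimal sketch of small-group suppression: trips are counted per group and any group below a threshold is dropped, so no published figure describes just a handful of people. This is a generic technique, not a description of Teralytics' proprietary approach; the record layout and the threshold of 10 are assumptions.

```python
from collections import Counter


def aggregate_trips(trips, min_group_size=10):
    """Count trips per (origin, destination, mode) and suppress small groups.

    `trips` is an iterable of (origin, destination, mode) tuples carrying no
    subscriber identifier. Groups smaller than `min_group_size` are dropped.
    """
    counts = Counter(trips)
    return {key: n for key, n in counts.items() if n >= min_group_size}


# Hypothetical input: 12 train trips from zone A to B, 3 car trips from A to C.
sample = [("A", "B", "train")] * 12 + [("A", "C", "car")] * 3
report = aggregate_trips(sample)  # only the 12-trip group survives
```

Suppression alone does not rule out re-identification from repeated or overlapping queries, which is presumably part of why the company layers other measures on top.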
"We are already fully compliant with GDPR," says MacLeod, referencing the incoming European Union data protection regulation that's bringing in new privacy requirements for companies handling EU citizens' personal data, as well as ramping up penalties for privacy violations.
"As an extra measure in Germany we are rehashing every 24 hours. But of course you still want to do long term profiles so we have developed a technique to actually still do that and be compliant and getting approval by the Germany privacy regulator for that," adds Polzer.
Clearly the hope is that their approach has been sensitive enough and robust enough to entirely defang any privacy concerns, regulatory or otherwise, though a lot may depend on the perception of the mobile subscribers whose data is ultimately fueling these commercial insights. (Which may be why the initial go-to-market strategy is focused on a goal that can be perceived as socially beneficial -- after all, which good citizen doesn't want to live in a 'smarter' city?)
In the case of Telefonica Germany, one partner Teralytics will name, Polzer says the carrier is providing an opt-out for users who do not want even anonymized details about how and where they travel and which apps and mobile websites they're looking at, to be used for third party analytics.
Though clearly not every carrier it works with might decide to offer the same choice to its subscribers.
"Of course there are some slightly relaxed rules [in some telco jurisdictions]," concedes Polzer. "On the other hand we need to invest in developing an algorithm that works outside Germany... We can't afford building a new algorithm for every single country. And also, to be frank, we very much view GDPR as the future -- we expect every regulator to, in the end, move in that direction anyway. So I don't think we're building a business that hopes for loopholes or depends on loopholes."
"We build the same privacy standard into the solutions we build, regardless of what the law does or doesn't require in that country," adds MacLeod.
Zooming out, to consider the telcos themselves, why do they need Teralytics? MacLeod demurs on this question, saying its partners don't "need" it -- given they do have their own in-house analytics teams -- but rather the sales pitch is around strategic focus; with telcos being most concerned about optimizing their own business processes, whereas Teralytics can offer itself as the "young, fast, flexible" startup partner which can be out in the market selling services to third parties to make more of carriers' data holdings, as well as also supporting them to drive more of their own core revenue if they wish.
The basic business model is a revenue share with carrier partners on any third party analytics deals Teralytics (or the two combined) are able to cut -- though MacLeod won't go into specifics, beyond billing the proposition as a "low cost, low risk, easy way into big data analytics" for telcos to eke more value out of their data holdings.
Polzer also points to the market constraints of telcos as a helper here -- noting this characteristic means they're not well-positioned to recoup the kind of investment needed to build a comparable machine learning analytics platform. Whereas Teralytics can invest because it can play and (it hopes) scale across multiple markets.
"A lot of the partners we've got now and some of the ones we're talking to now they fundamentally believe that data is the new gold and it's going to be the new currency," says MacLeod. "We have value to add because we're really good at what we do and it's hard, especially when you consider not only the complexity of the machine learning that has to go into providing these insights, but... the privacy -- this isn't something that you can just go hire a bunch of data scientists off the streets and do it."
Of course one market that some telcos are demonstrably very keen on expanding into, based on how much they can infer about their customers, is digital advertising.
Just this week U.S. carrier Verizon, for example, announced a rewards program for its subscribers that requires them to agree to share personal data (such as their browsing habits) with its digital ad division, so expressly for marketing purposes, in exchange for the ability to earn loyalty rewards. Ergo, it's gunning to build up an ad targeting empire -- a la Google and Facebook. (And for that reason recently spent big to gobble up veteran digital ad player, Yahoo.)
So is helping carriers enhance their ability to target ads at their users something that Teralytics wants to do too?
"At the moment it's not the focus," says Polzer, after a slight hesitation. "Of course we are getting approached by operators about this topic, but at the moment it's definitely not the focus."
"And in most of the territories we operate in it's not allowed anyway, so it's relatively straightforward," adds MacLeod.
Might the startup look at moving into that line of business in future -- if/where regulatory conditions are favorable? "Our focus for the business is clear and it's not that," returns MacLeod. "The guys started out, long before I turned up, wanting to uncover insights into human behavior. And even though I'm here now, and I'm the commercialization guy, and we're looking at other sources of data besides telco data, the intention is to build on that. We see lots of interesting big scale trends in terms of how people move around differently, how people live differently... There's so much huge interesting stuff emerging from that that's why we've placed the focus on smart cities and transportation, and the intersection of it."
"I'd never say never to anything but I can tell you with absolute certainty it's not our focus," he adds. "We are privacy by design throughout so whatever we do we'll never go anywhere near anything that breaches people's privacy because it's literally built into the fabric of the company."
"Doing advertising in an opt-out way -- we don't think that's really sustainable in the long term," continues Polzer, when I press on how interested telcos are in growing digital advertising businesses, going on to suggest that the point at which Teralytics might apply its platform to a digital advertising use-case would be "if telco operators are able to build a meaningful opt-in base" (i.e. for individually targeted marketing).
"Which we haven't seen in any market yet," he adds. "I do hope and wish all the telcos luck and success in making this transformation. But I think they would have to build up a meaningful opt-in base for us to play a role there. But once they're ready, I think we are probably the best providers of human mobility data."
One of Teralytics' ongoing partner conversations is "specifically about an opt-in case", adds MacLeod. "So we're looking at this, it's very early stage for us. We haven't decided to work with the partner at all -- let alone whether we want to participate with that case. But for me the concerns go away around why not to do it if everybody who's potentially going to be targeted by a solution has positively said yes I would like to be. Rather than they haven't got around to saying that they wouldn't like to be."
Whatever the outcome of that particular carrier conversation, the team's goal right now -- several years in, with multi millions raised and what it's pitching as a solid platform under its feet to get carriers to jump on board -- is accelerating commercialization. The plan is to dial up sales and customer acquisition by building out the commercial team and putting its energy into front office ops for the coming year. Aka "it's time to really ramp this thing," as MacLeod puts it.
"In the next 12 months our plans are to accelerate in the markets that we have -- so we're active now in four markets already, as in we've got four significant live partners in the US, in Germany, in Singapore and in Hong Kong," he says. "We are actively talking to partners that would give us either two or three territories so we will increase the geographical footprint because the platform itself is so replicable and so scalable and the way that they've built it, even though the nature of our relationship is slightly different it's... not very far away from plug and play. From the day that we decide to do it it's two or three weeks until we can be live with some insights in any market."
"A big chunk of our platform is highly productized," adds Polzer. "We are very flexible in the way of extracting number of subway trips vs number of car trips. One customer might come with a question: 'how often do people take the train?' -- another customer say: 'what happens if I reconstruct a bridge?' and for us these might be different customer questions but the underlying analysis is the same."
The underlying question for Teralytics' big data play is what will mobile users say? Will they feel comfortable if their carrier decides to track and analyze their personal data for commercial gain? Providing a stable and reliably affirmative answer there may prove to be this startup's biggest challenge.