HubSpot Data Management: 100 Projects, 7 Honest Lessons

Written by David Koehler | Jun 2, 2026 5:38:54 PM

Over the past few years we have looked after well over 100 HubSpot portals. Most of them look similar on the inside. The teams work carefully, and yet the same things keep going wrong with data management. Things almost nobody talks about openly.

Most guides explain which features you need to use. Merge duplicates, set required fields, set up a couple of workflows. That is all correct, and yet six months later the portal is cluttered again. The more interesting question is why that happens and what it costs when your database is not in order. Reports nobody in the meeting relies on. Budget that flows into dead contacts. Leads that no one can say anything useful about anymore.

We will walk through the points we run into in almost every portal. You will probably recognise one or two of them.

What clean data management really means

You probably already know the basics. Merge duplicate contacts and companies, define which fields are required for each object type, use dropdowns and fixed field types on input instead of free text, and automate recurring corrections via workflows. HubSpot ships its own tools for this, from duplicate management to the data quality command center.

HubSpot's data quality command center brings duplicates, formatting errors, and property insights together in one place. Source: HubSpot Knowledge Base

That is the mandatory part. But it is not enough, because it does not solve the real problem. A portal does not get messy because nobody knows how to merge duplicates. It gets messy because new data keeps flowing in every day and nobody makes sure it stays clean. This is where the actual work begins.

The seven most common problems

On their own they look harmless. Together they make sure that, over time, nobody trusts your portal anymore.

Hundreds of properties nobody understands anymore

In almost every portal that has been running for more than two years we find several hundred properties. A large share of them is no longer used anywhere, in no list, no workflow, no report.

This does not happen out of carelessness. Every new campaign, every connected tool, every quick request from a team creates a new field. Things are rarely deleted, because nobody is sure whether the field is needed somewhere. So the list grows year after year.

For you as a marketing lead, the problem is not the volume but the ambiguity. When there are three fields for the same piece of information, nobody knows which one holds the truth. Reports pull from the wrong field, segments become imprecise, and the numbers in the meeting depend on who built the report.

What helps is a regular look at your properties. HubSpot shows you which fields are unused and which ones overlap. Those fields should be merged or removed, and new fields need clear naming rules. Cleaning up once is not enough, because new ones keep appearing. It takes a fixed routine.
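As a rough illustration of what such a routine checks, here is a minimal sketch in Python. It assumes you have exported, per property, how often it is referenced in lists, workflows, and reports. The property names and the data structure are made up for the example and are not a HubSpot API response.

```python
# Hypothetical export: property name -> number of places it is referenced.
# Names and structure are illustrative only, not a HubSpot API format.
properties = {
    "industry": {"lists": 3, "workflows": 1, "reports": 2},
    "industry_old": {"lists": 0, "workflows": 0, "reports": 0},
    "branche": {"lists": 0, "workflows": 0, "reports": 0},
    "lead_source_2021": {"lists": 0, "workflows": 1, "reports": 0},
}


def unused_properties(props):
    """Return the names of properties referenced nowhere, sorted alphabetically."""
    return sorted(
        name for name, usage in props.items() if sum(usage.values()) == 0
    )


candidates = unused_properties(properties)
print(candidates)
```

Fields that show up here are candidates for review, not for automatic deletion. A property can still matter to an integration that this kind of export does not see.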

Nobody is really responsible for the data

When we ask who in the company is responsible for data quality, we usually do not get a clear answer. Marketing assumes that sales maintains its contacts. Sales assumes the system already handles it correctly. In the end, nobody feels responsible.

That is the real reason why standards exist on paper but are not followed day to day. Everyone maintains things a little differently, so in the end nobody maintains them properly. Without clear ownership, even a well-built portal loses quality over time.

What helps is clear ownership. A person or a team that is responsible for the data, and a clear rule about which system has authority over which field. In many companies this role does not exist internally, because nobody has the time or the HubSpot knowledge for it. In that case it makes sense to hand the task to an external partner.

Lifecycle stages that no longer match

The lifecycle stage is meant to show where a contact stands in the process, from lead to customer. In practice it is often wrong. Existing customers sit in the system as leads, old leads suddenly show up as customers, and nobody knows how that happened.

The causes are usually manual edits and workflows that contradict each other. There is also a detail many people do not know. When you merge two contacts, HubSpot keeps the lifecycle stage that is furthest along in the process. So if you merge a fresh lead with an old customer record, the merged contact quietly becomes a customer again, even though that is factually wrong.
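The merge rule described above can be sketched in a few lines of Python. The stage list is a simplified assumption for the example, and the function only models the "furthest along wins" behaviour. It is not HubSpot's actual implementation.

```python
# Simplified stage order, earliest to furthest along (assumed for this example).
STAGE_ORDER = ["subscriber", "lead", "mql", "sql", "opportunity", "customer"]


def merged_stage(stage_a, stage_b):
    """Model of the rule: the stage furthest along in the process survives."""
    return max(stage_a, stage_b, key=STAGE_ORDER.index)


# A fresh lead merged with an old customer record ends up as a customer.
print(merged_stage("lead", "customer"))
```

This is why it pays to check the lifecycle stage of both records before merging, and to correct it by hand afterwards if the result does not match reality.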

For reporting, this is fatal. Your funnel analysis then shows numbers that have nothing to do with reality, and the handover between marketing and sales gets muddled because the stages are not reliable. Anyone who wants to cleanly separate MQLs and SQLs first needs clean lifecycle stages.

What helps are clear definitions for each stage, automation that sets them consistently, and a regular eye on whether the distribution still looks plausible.

Duplicates that keep coming back

Almost every company has cleaned up its duplicates at some point. And almost every time, they are back a few months later. That is because duplicates are not a one-time state but are created continuously. Through forms with slightly different email addresses, through imports, through integrations without proper matching rules. Merging once does not solve the problem, because the inflow continues.

HubSpot's duplicate management suggests likely duplicates to merge or reject. Source: HubSpot Knowledge Base

There are two things about merging itself that you should know. First, a merge in HubSpot is final. There is no undo button, and not even HubSpot support can reverse it. Second, for most fields it is not the primary record that wins, but the most recently changed value. So anyone merging quickly and without checking can easily overwrite good data with worse data.
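To make the second point concrete, here is a minimal Python sketch of a "most recently changed value wins" merge. The record layout, a mapping from field name to value and last-changed date, is an assumption for illustration. HubSpot's real merge logic has exceptions that this sketch ignores.

```python
from datetime import date


def merge_records(primary, secondary):
    """Merge two records where each field maps to (value, last_changed).

    For every field, the most recently changed value wins, regardless of
    which record is the primary one. On a tie, the primary value is kept.
    """
    merged = {}
    for field in primary.keys() | secondary.keys():
        versions = [rec[field] for rec in (primary, secondary) if field in rec]
        merged[field] = max(versions, key=lambda pair: pair[1])
    return merged


primary = {
    "phone": ("+41 44 123 45 67", date(2023, 5, 1)),
    "jobtitle": ("Head of Marketing", date(2024, 1, 10)),
}
secondary = {"phone": ("n/a", date(2025, 2, 3))}

result = merge_records(primary, secondary)
print(result["phone"][0])  # the newer but worse value replaces the good one
```

In this example the valid phone number on the primary record is lost, simply because the placeholder on the duplicate was edited later.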

It gets especially tricky when HubSpot is connected to another system such as Salesforce. If a contact is merged on the other side, it can disappear from HubSpot before its history transfers to the surviving record. This creates gaps in attribution without anyone noticing.

What helps is to stop duplicates where they originate. Set up forms and integrations so they recognise existing records instead of creating new ones, define clear merge rules, and treat the whole thing as a fixed routine rather than a one-off action.
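What "recognise existing records" means in practice can be shown with a small Python sketch. It assumes the email address is the matching key and that trimming whitespace and lowercasing is an acceptable normalisation. Both the function names and the in-memory store are hypothetical, and a real integration would match against the CRM instead.

```python
def match_key(email):
    """Normalise an email address for matching: trim whitespace, lowercase."""
    return email.strip().lower()


def find_or_create(contacts, email):
    """Return (record, created). Reuses an existing record when the key matches."""
    key = match_key(email)
    if key in contacts:
        return contacts[key], False
    contacts[key] = {"email": key}
    return contacts[key], True


contacts = {}
first, created_first = find_or_create(contacts, "Anna.Muster@Example.com ")
second, created_second = find_or_create(contacts, "anna.muster@example.com")
print(created_first, created_second, len(contacts))
```

Both submissions end up on the same record. Without the normalisation step, the second one would have created a duplicate.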

You pay for contacts you are not allowed to email

This point costs most companies money every month without anyone noticing.

In HubSpot you do not pay for all contacts, only for marketing contacts. But every new contact from a form, an import, or an integration is automatically created as a marketing contact. If someone later unsubscribes or the address is dead, HubSpot does stop sending that person emails. It still does not downgrade them, though. So you keep paying for someone you can no longer reach.

With a thousand such dead records, several thousand euros a year quickly add up, lost with no value in return.

This can be solved with two changes. First, new contacts enter the system as non-marketing contacts instead of automatically as marketing contacts. Second, a workflow automatically downgrades contacts as soon as they have been inactive for 180 days. In many portals this cuts the HubSpot bill by 30 to 50 percent. Timing matters, though. HubSpot sets the billable number on the first of the month, so changes only take effect the following month. If you want to save, clean up before the month rolls over, not after.
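The decision such a downgrade workflow makes can be written down as a short rule. The following Python sketch is only an illustration. The field names (`is_marketing`, `unsubscribed`, `hard_bounced`, `last_activity`) are assumptions, and treating unsubscribed or bounced contacts as immediate downgrade candidates is our reading of the problem described above, not a HubSpot default.

```python
from datetime import date, timedelta

INACTIVITY_LIMIT = timedelta(days=180)


def should_downgrade(contact, today):
    """Decide whether a marketing contact should become non-marketing."""
    if not contact["is_marketing"]:
        return False  # nothing to downgrade
    if contact["unsubscribed"] or contact["hard_bounced"]:
        return True  # can no longer be emailed anyway
    return today - contact["last_activity"] >= INACTIVITY_LIMIT


today = date(2026, 6, 2)
stale = {
    "is_marketing": True,
    "unsubscribed": False,
    "hard_bounced": False,
    "last_activity": today - timedelta(days=180),
}
active = dict(stale, last_activity=today - timedelta(days=30))
print(should_downgrade(stale, today), should_downgrade(active, today))
```

In the portal itself, this logic would live in a workflow with the matching enrolment criteria. The sketch only shows which contacts the rule is meant to catch.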

Reports nobody trusts in the meeting

At some point you are sitting in a meeting and two people show different numbers from the same HubSpot. Or attribution credits a channel with wins you know are not real. This is rarely a reporting problem. It almost always comes down to the data underneath.

A common reason that hardly anyone has on their radar is forms. HubSpot forms overwrite existing values by default. There is no setting along the lines of “only fill in if the field is empty”. So if a known lead fills in a second form, the new entry overwrites the original source. Where the lead originally came from is then gone.

Even more awkward is a special case. If someone fills in a form using the same browser a different person used before, or clicks through via a forwarded marketing email, HubSpot assigns the new entries to the old contact and overwrites its data. The reason is that the association runs through a cookie, not through the person.

For you, this means that exactly the question you care about most becomes unanswerable. Which channel actually drives revenue. If the source is unreliable, you end up deciding on gut feeling, even though you have a full CRM.

What helps is to protect the important fields. The original source is set once and not overwritten afterwards, important entries are only updated when they are still empty, and form settings are reviewed deliberately rather than simply taken as they come.
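The "only fill in if the field is empty" rule is easy to express in code, even though the form itself does not offer it. Here is a minimal Python sketch. The field name `original_source` and the set of protected fields are hypothetical, and in a real portal this logic would typically sit in a workflow or in the integration that writes to HubSpot.

```python
# Hypothetical set of fields that must never be overwritten once set.
PROTECTED_FIELDS = {"original_source"}


def apply_submission(contact, submission):
    """Return an updated copy of the contact.

    Protected fields are only written while they are still empty.
    All other fields take the newly submitted value.
    """
    updated = dict(contact)
    for field, value in submission.items():
        if field in PROTECTED_FIELDS and updated.get(field):
            continue  # keep the original value
        updated[field] = value
    return updated


known_lead = {"original_source": "Organic search", "phone": "044 123 45 67"}
second_form = {"original_source": "Paid social", "phone": "079 765 43 21"}

after = apply_submission(known_lead, second_form)
print(after["original_source"], "|", after["phone"])
```

The phone number is updated, while the original source survives the second form submission.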

The big cleanup day that achieves nothing

Many companies tackle the topic once a year. Someone blocks out a day, cleans up, and three months later the portal looks the same as before. The reason is simple. The inflow of new data and the decay of the old never stop. A one-off action simply cannot win against that.

Formatting errors can be corrected with automated rules in HubSpot, instead of fixing them by hand. Source: HubSpot Knowledge Base

There is also something people rarely say. The genuinely effective tools sit in the higher licence tiers. Automated formatting via workflows and the data quality command center require at least Operations Hub Professional, and AI-powered duplicate detection even requires Enterprise. Particularly useful are rules that fix formatting errors not just on existing data, but automatically on every new record that comes in. Many portals struggle by hand without the right tier, while others pay for a tier whose tools were never switched on.

Both lead to the same result. A lot of effort, recurring chaos, and in the end money spent on something that is not being used.

Data management is not a project, it is an ongoing operation

When you line the points up next to each other, something becomes visible. None of them can be solved once and then ticked off. Data ages constantly because people change jobs, companies relocate, and addresses die. According to MarketingSherpa, B2B contact data decays by around 2.1 percent per month, or roughly 22.5 percent per year. In fast-moving industries such as technology the figure is even higher, and the average B2B contact changes jobs about every 18 months. So your database continuously loses value, no matter how thoroughly you clean up once.
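The step from the monthly to the yearly figure is simple compounding, and a short Python check shows how the two numbers fit together. The only assumption is that the 2.1 percent applies each month to the records that are still valid.

```python
monthly_decay = 0.021  # 2.1 percent of the still-valid records per month

# Share of records still valid after 12 months, then the share that decayed.
still_valid = (1 - monthly_decay) ** 12
annual_decay = 1 - still_valid

print(f"{annual_decay:.1%}")
```

Under that assumption, a little more than a fifth of a contact base becomes outdated within a year.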

This is not merely a hygiene question but a revenue question. Gartner estimates the cost of poor data quality at an average of 12.9 million dollars per company per year. That figure is pulled up by large enterprises; for a mid-sized company the cost is more likely in the six-figure range per year. Harvard Business Review puts the damage at 15 to 25 percent of revenue. And in Validity’s State of CRM Data Management 2025, more than a third of respondents said they had lost revenue directly because of poor data quality. In short, your data is an asset that loses value the moment you stop maintaining it.

So the right question is not how you get your portal clean once, but who makes sure it stays clean, and with what routine. This is exactly where most internal solutions fail. Not for lack of skill, but because nobody carries the responsibility or has the time to look after it continuously.

An ongoing operation is the answer. Instead of one-off cleanup projects, the work of keeping data, workflows, permissions, and compliance in order happens month after month, so your portal stays clean and you can focus on marketing and sales.

The Swiss special case

A word on compliance, because it plays a particular role in Switzerland. Clean data is not only a question of good reporting but also a legal obligation. Opt-out requests have to be honoured reliably, deletion periods kept, and access logged in a traceable way. Under the revised Swiss Data Protection Act (revDSG), specific requirements apply that do not match the GDPR on every point.

Anyone operating in Switzerland and the EU has to meet both frameworks at the same time. In practice that means setting up consents, deletion periods, and access rights so they satisfy revDSG and GDPR at once. Cleanly maintained data is the basic prerequisite for this, because deletion periods and opt-out status can only be enforced reliably when the database is in order.

Frequently asked questions

What does data management mean in HubSpot?

Data management covers everything that keeps the data in your portal correct, complete, and consistent. That includes merging duplicates, maintaining properties and lifecycle stages, fixing formatting errors, and monitoring integrations.

How often should you maintain HubSpot data?

Ideally on an ongoing basis rather than once a year. A short monthly routine for duplicates and obvious errors, plus a more thorough quarterly review, keeps the portal clean over time. A one-off cleanup fizzles out because new data keeps flowing in.

Which HubSpot tools help with data management, and which licence do you need?

Duplicate management is available on all tiers. Automated formatting via workflows and the data quality command center require Operations Hub Professional, and AI-powered duplicate detection requires Enterprise. Which tier makes sense depends on how much you want to automate.

What does poor data quality cost?

Both directly and indirectly. Directly, you pay for marketing contacts you no longer email, for example. Indirectly, costs come from wrong reports, wasted budget, and decisions made on shaky ground. Harvard Business Review puts the damage at 15 to 25 percent of revenue, and Gartner at an average of 12.9 million dollars per company per year.

How do you stop duplicates from coming back?

By stopping them where they originate. Set up forms and integrations so they recognise existing contacts instead of creating new ones, and handle merging by clear rules and as a fixed routine.

Can HubSpot be run in a revDSG- and GDPR-compliant way?

Yes. The key elements are clean consent via double opt-in, enforced deletion periods, and traceable access logs. If you operate in Switzerland and the EU, you should cover both frameworks together.

Stay clean instead of cleaning up

A portal nobody maintains gets more inaccurate month after month, until nobody in the meeting trusts the numbers anymore. Avoiding that does not take a bigger cleanup; it takes a reliable ongoing operation. As a HubSpot Diamond Partner with more than 100 projects under our belt, we see every day what keeps a portal healthy and what quietly erodes it.

Want a second pair of eyes on your HubSpot data?

Data quality isn't a project you finish. It's an operation someone has to run. That's exactly what we do with HubSpot Admin-as-a-Service. Month by month, at a fixed price, with clear response times and a dedicated team behind your portal.

If you'd like to see what that would look like for your portal, let's have a 30-minute call and we will walk you through it.
