2020-12-07 DM - Metadata duplicate management
Date
Dec 7, 2020
Attendees
@Mandy Chessell
Goals
Review the proposal for upgrading the existing duplicate detection support in the discovery server to cover
Discussion items
| Time | Item | Who | Notes |
|---|---|---|---|
| 10 mins | Introductions | Maryna | |
| 40 mins | Modelling duplicates | Mandy | |
| 50 mins | Engine Services | Mandy | |
| 20 mins | Next Steps | All | Discussion |
Notes
De-Duplication:
Issues with current model:
Multiple IGCs synced with Egeria.
Objects and their relationships change over time.
Duplicate detection is currently supported only for assets.
New approach in stewardship and discovery.
Record a fingerprint of the metadata, for duplicate detection in discovery.
Use OMASs for previously defined elements.
Using KnownDuplicateLink/KnownDuplicate in repositories will merge relationships, with logic for cascade delete, anchors, etc.
This can be automated with engines.
Combining KnownDuplicate to merge with KnownDuplicateLinks with type: Consolidated and status: APPROVED in the resulting linkage.
Use stewardship to resolve or invalidate bad information in the linked elements.
Suggest we don't copy relationships from the host repository, but link to them instead.
Open metadata types
Same qualifiedName values for two assets.
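The fingerprint idea above could be sketched roughly as follows. This is only an illustration, not how the discovery server does it: the choice of qualifiedName and type as fingerprint inputs, the dictionary shape of an element, the function names, and the PROPOSED status are all assumptions made for the example. Only KnownDuplicateLink comes from the notes.

```python
import hashlib
import json
from collections import defaultdict
from itertools import combinations


def fingerprint(element, keys=("qualifiedName", "type")):
    """Hash a chosen subset of properties into a stable fingerprint.

    The subset of properties is an assumption for this sketch.
    """
    selected = {key: element.get(key) for key in keys}
    canonical = json.dumps(selected, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def propose_duplicate_links(elements):
    """Group elements by fingerprint and propose one link per matching pair."""
    groups = defaultdict(list)
    for element in elements:
        groups[fingerprint(element)].append(element["guid"])

    links = []
    for guids in groups.values():
        for end1, end2 in combinations(sorted(guids), 2):
            # "PROPOSED" is a hypothetical status; a steward would approve it later.
            links.append({"link": "KnownDuplicateLink",
                          "end1": end1, "end2": end2,
                          "status": "PROPOSED"})
    return links


if __name__ == "__main__":
    catalog = [
        {"guid": "a1", "qualifiedName": "db.sales.orders", "type": "Asset"},
        {"guid": "b2", "qualifiedName": "db.sales.orders", "type": "Asset"},
        {"guid": "c3", "qualifiedName": "db.sales.customers", "type": "Asset"},
    ]
    for link in propose_duplicate_links(catalog):
        print(link)
```

Proposing links rather than merging straight away keeps the stewardship step in the loop, in line with the suggestion to link rather than copy.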
Engine Services:
Consolidate discovery/stewardship into the engine host.
Flow:
Engine service, engine type, request type, type of service, action.
Discovery (Asset Analytics) for actions from inside of actions.
WatchDog watches changes in metadata looking for problems:
de-duplication work
classifications
creates request-for-actions in Triage
Triage receives request-for-actions from WatchDog for automatic or manual interaction.
Remediation gets issues from WatchDog and receives interactions in automated/manual actions.
Scheduling for things needed from regulation or automation based on time (request-for-action).
Events listening in each engine and progression.
Discovery engine works with services such as quality, annotations, etc.
Open Discovery Analysis Report (ODAR)
WatchDog is triggered on completed ODARs from above.
OMAS partnered with service invocation.
Triage uses rules/triggers/actions to move state/events
There is a choice to send directly to Remediation based on status.
Scheduler: effective date or actions on dates.
Asset Provisioning event for external event or action or engine, internal or external.
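A rough sketch of the WatchDog, Triage and Remediation hand-off discussed above. Everything concrete here is an assumption for illustration: the RequestForAction fields, the event dictionary shape, the status values, and the rule that a missing classification can be fixed automatically while a possible duplicate needs a steward.

```python
from dataclasses import dataclass, field


@dataclass
class RequestForAction:
    """Hypothetical request raised by WatchDog and passed between engines."""
    element_guid: str
    problem: str
    status: str = "NEW"
    history: list = field(default_factory=list)


def watchdog(change_events):
    """Scan metadata change events and raise a request for each suspected problem."""
    for event in change_events:
        if event.get("suspected_duplicate"):
            yield RequestForAction(event["guid"], "possible-duplicate")
        elif not event.get("classifications"):
            yield RequestForAction(event["guid"], "missing-classification")


def triage(request, auto_fixable=("missing-classification",)):
    """Decide whether the request is handled automatically or by a steward."""
    request.history.append("triage")
    request.status = "AUTOMATED" if request.problem in auto_fixable else "MANUAL"
    return request


def remediate(request):
    """Resolve automated requests; park manual ones until a steward acts."""
    request.history.append("remediation")
    if request.status == "AUTOMATED":
        request.status = "RESOLVED"
    else:
        request.status = "WAITING_FOR_STEWARD"
    return request


if __name__ == "__main__":
    events = [
        {"guid": "a1", "suspected_duplicate": True, "classifications": ["Confidential"]},
        {"guid": "b2", "classifications": []},
        {"guid": "c3", "classifications": ["Confidential"]},
    ]
    for request in watchdog(events):
        print(remediate(triage(request)))
```

In the real design each stage would be its own engine listening for events; chaining plain function calls here only shows the order of the hand-offs.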
Metadata: 0461 Governance Action Engines
We could create an archive to target specific domains/package to add capability.
0462: Governance Action Types
GovernanceActionType: current to next, or define a pre-defined plan based on conditions.
SoftwareServerCapability links to the engine to take action.
0463: Governance Actions
Audit record of actions taken in governance.
GovernanceAction: the asset/referenceable being worked on, action type, engine, and activity.
Highlight where the stewardship decisions are made.
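To make the "current to next" idea and the audit record more concrete, here is a minimal sketch. The class names borrow GovernanceActionType and GovernanceAction from the notes, but the fields, the outcome strings, the engine names and the run_plan helper are assumptions for illustration and do not reflect the actual 0462/0463 type definitions.

```python
from dataclasses import dataclass


@dataclass
class GovernanceActionType:
    """Template for a step: which engine runs it and what follows each outcome."""
    name: str
    engine: str
    next_on: dict  # outcome -> name of the next action type (assumed shape)


@dataclass
class GovernanceAction:
    """Audit record of one executed step."""
    element_guid: str
    action_type: str
    engine: str
    outcome: str


def run_plan(plan, start, element_guid, execute):
    """Walk the plan from `start`, recording one GovernanceAction per step."""
    audit = []
    seen = set()
    current = start
    while current is not None and current not in seen:
        seen.add(current)  # guard against plans that loop back on themselves
        action_type = plan[current]
        outcome = execute(action_type, element_guid)
        audit.append(GovernanceAction(element_guid, action_type.name,
                                      action_type.engine, outcome))
        current = action_type.next_on.get(outcome)
    return audit


if __name__ == "__main__":
    plan = {
        "detect": GovernanceActionType("detect", "watchdog-engine",
                                       {"duplicate-found": "triage"}),
        "triage": GovernanceActionType("triage", "triage-engine",
                                       {"approved": "merge"}),
        "merge": GovernanceActionType("merge", "remediation-engine", {}),
    }
    canned = {"detect": "duplicate-found", "triage": "approved", "merge": "done"}
    for record in run_plan(plan, "detect", "a1", lambda t, guid: canned[t.name]):
        print(record)
```

The audit list is where a stewardship decision, such as the "approved" outcome at triage, would be visible afterwards.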
Asset Analysis
Engine Host OMAG Server
Review Material
Action items