Share use cases and design of Egeria's support for lineage of file processing
Discussion items
Time
Item
Who
Notes
5 mins
Welcome
All
45 mins
File Lineage
Mandy
Attachment reviewed:
A unique challenge with files is that the Asset you're interested in could be at various levels:
it could be just the file itself (without much concern for the folder it appears in)
or it could be the contents of a folder as a whole that are of interest (eg. if the folder contains something like rolling logs, or a number of files that each contain a daily snapshot of information, etc) (DataFolder)
important to treat these distinctly for various reasons:
they behave differently in lineage, eg. a DataFolder avoids showing a fan-out of many daily snapshot files when it's really one holistic dataset that happens to have daily snapshots within it
they need different connector types to read their contents (one reads a file directly, the other needs to combine the contents of all of the files in the folder)
but also to ensure that the DataFolder actually extends FileFolder (so that a given instance is actually both, and can therefore be consumed in both ways depending on the user and their needs in accessing it)
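The distinction above can be sketched in a few lines of Python. This is only an illustration of the idea discussed, not Egeria's implementation: the class names mirror the FileFolder and DataFolder types mentioned in the notes, but the fields and the functions `read_file`, `read_data_folder` and `lineage_nodes` are hypothetical names invented for this sketch.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DataFile:
    """A single file, modelled here as a name plus its lines (assumed shape)."""
    name: str
    lines: List[str]


@dataclass
class FileFolder:
    """A plain folder: a container of individual files."""
    name: str
    files: List[DataFile] = field(default_factory=list)


@dataclass
class DataFolder(FileFolder):
    """A folder whose files together form one logical data set.

    Because it extends FileFolder, an instance can be consumed either way.
    """


def read_file(data_file: DataFile) -> List[str]:
    # "File connector" (hypothetical): reads one file directly.
    return list(data_file.lines)


def read_data_folder(folder: DataFolder) -> List[str]:
    # "Folder connector" (hypothetical): combines the contents of all files,
    # here in file-name order.
    combined: List[str] = []
    for data_file in sorted(folder.files, key=lambda f: f.name):
        combined.extend(read_file(data_file))
    return combined


def lineage_nodes(folder: FileFolder) -> List[str]:
    # A DataFolder appears in lineage as one node; a plain FileFolder
    # fans out into one node per file.
    if isinstance(folder, DataFolder):
        return [folder.name]
    return [data_file.name for data_file in folder.files]
```

For a DataFolder holding daily snapshot files, `lineage_nodes` returns just the folder name, while the same files in a plain FileFolder produce one node per file. `isinstance(folder, FileFolder)` holds in both cases, which is the "actually both" property noted above.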
Discussed classification of elements (like files) that may be deleted but still need to be present in the lineage
for example, a landing file that is picked up by a process and moved elsewhere: we need to know about the landing file (even though it's been deleted) to show the lineage all the way back to its ultimate source
we believe there is a relatively common industry term for this concept: Tombstone
however, we believe this could be considered a sensitive trigger term, and are therefore in favour of some other term to classify such elements in lineage: the suggestion is to go for Memento
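A minimal sketch of how such a classification could keep lineage intact, assuming a simple in-memory graph. The `LineageGraph` class and its methods are hypothetical and are not Egeria APIs. Only the classification name Memento comes from the discussion above. The sketch also assumes the lineage graph has no cycles.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set

MEMENTO = "Memento"  # classification name suggested in the discussion


@dataclass
class Element:
    name: str
    classifications: Set[str] = field(default_factory=set)


class LineageGraph:
    """Hypothetical in-memory lineage store, for illustration only."""

    def __init__(self) -> None:
        self.elements: Dict[str, Element] = {}
        self.upstream: Dict[str, List[str]] = {}

    def add_flow(self, source: str, target: str) -> None:
        # Record that data flows from source to target.
        for name in (source, target):
            self.elements.setdefault(name, Element(name))
        self.upstream.setdefault(target, []).append(source)

    def delete(self, name: str) -> None:
        # Classify the element instead of removing it, so lineage survives.
        self.elements[name].classifications.add(MEMENTO)

    def active_elements(self) -> List[str]:
        # Elements that still exist, i.e. those not classified as Memento.
        return [
            name
            for name, element in self.elements.items()
            if MEMENTO not in element.classifications
        ]

    def ultimate_sources(self, name: str) -> Set[str]:
        # Walk upstream, passing through Memento elements as well.
        parents = self.upstream.get(name, [])
        if not parents:
            return {name}
        sources: Set[str] = set()
        for parent in parents:
            sources |= self.ultimate_sources(parent)
        return sources
```

With flows source-system → landing.csv → move-process → archive.csv, calling `delete("landing.csv")` drops the landing file from `active_elements()`, yet `ultimate_sources("archive.csv")` still resolves to the source system because the classified element remains in the graph.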