The DESI VII workshop will provide a platform for discussion of best
practices and innovations in the use of advanced search technology,
text classification, language processing, data organization,
visualization and related techniques for the purposes of accessing and
managing electronically stored information. One focus of the DESI VII
workshop will be on emerging protocols and novel techniques for
identifying and protecting sensitive information in large collections.
The workshop will also welcome contributions on other topics that are
within the workshop’s broader scope. We expect the refined focus on
protecting sensitive content this year to be directly relevant to at
least four application contexts:
In eDiscovery: What techniques are currently being used to classify information found in email or other data sources as privileged, confidential, or otherwise protected by law? How widespread is the use of technology for this type of information identification? How well do current technologies perform with respect to the classification of sensitive information?
In EU privacy policies: To what degree can current algorithmic techniques adequately characterize content that individuals might wish to have blocked from certain types of access in adherence with “right to be forgotten” laws? To what extent can the process of adjudicating such requests reasonably be automated? How well do algorithmic techniques perform in identifying sensitive data that may need to be blocked from cross-border transfers? To what extent can these capabilities satisfy requirements for algorithmic accountability?
In audits and investigations: What tools and techniques are available to find and protect well-defined categories of sensitive content? Examples from the US and Canada might include protected health information, student education records, customer record information, card holder data, or proprietary or confidential information (e.g., trade secrets). To what extent can taxonomies be constructed for information that is routinely the focus of internal audits to facilitate automatic detection of those categories of information? To what extent can technical support for investigations be designed to protect sensitive content that is not material to the investigation?
In public access requests: How well can current procedures and automated techniques identify and protect personal, political, proprietary or otherwise confidential content? To what extent can automated techniques reliably detect specific types of personally identifiable information which, if released, would constitute an unwarranted invasion of privacy?
The workshop discussion will be grounded in the results of original research, including work reported in interdisciplinary venues such as ICAIL, law reviews, technical conferences in specific disciplines (e.g., KDD, ICWSM, ACL, SIGIR), and shared task evaluations (e.g., TREC, CLEF, NTCIR).
Participation is invited from all interested parties, including those with backgrounds in:
Submissions should be sent by email to Jack Conrad (jack.g.conrad (put at sign here with no spaces on either side) tr.com) with the subject line DESI VII RESEARCH/OPERATIONALPRACTICE PAPER or DESI VII POSITION PAPER. All submissions received will be acknowledged within 3 days.
A PDF of the second Call for
Submissions is also available.
DESI VII follows five successful prior DESI (Discovery of
Electronically Stored Information) workshops: at ICAIL 2007 (DESI I, Palo
Alto), ICAIL 2009 (DESI III, Barcelona), ICAIL 2011 (DESI IV,
Pittsburgh), ICAIL 2013 (DESI V, Rome), ICAIL 2015 (DESI VI, San
Diego), and an intermediate workshop (DESI II) at University College
London in 2008. In DESI I, a wide array
of individuals came together for perhaps the first time to foster
engagement between e-discovery practitioners and a broad range of
research communities who might contribute to the development of new
technologies to support the e-discovery process. The DESI II and III
workshops broadened the scope of this discussion to include
comparisons of requirements between differing national settings and
legal environments. DESI IV built on these efforts with a
first-of-its-kind general discussion of standard-setting for the legal
profession, considering ISO 9001 frameworks as well as
capability maturity models. DESI V then extended the
discussion of standards to the question of what standards
could and should apply to the use of predictive coding
and other advanced techniques that were at the time beginning to be
cited in U.S. case law. The DESI VI workshop in San Diego aimed to
broaden the scope of legal issues to which advanced data analysis and
classification technologies might credibly be applied, beyond
e-discovery to a fuller range of information governance applications.
Keynote address: Maura Grossman and Gordon Cormack (University of
Waterloo) will speak on "Selective Digital Amnesia." An abstract is available.
Invited speakers: Tim Gollins (National Records of Scotland) and Craig Macdonald (University of Glasgow) will speak on "Assisting Digital Sensitivity Review of Government Records." An abstract is available.
Further information about the program will be posted here as papers are