Lee,

Porting to Java for inclusion in the toolkit is a good idea.  Here's a stopgap that will send most any Java programmer shrieking in horror: the Perl module Inline::Java::PerlInterpreter [1] provides a jar that lets you call the Perl interpreter from Java.  I guess you could run Date::Manip from there until it is ported.  I'd be really curious to see if it works, though granted, it might be too experimental for the toolkit.

I think what Date::Manip provides would make a really nice, simple web service, regardless of the language.

[1] http://search.cpan.org/~patl/Inline-Java-0.52/Java/PerlInterpreter/PerlInterpreter.pod

Clay

~~~~~~~~~~~~~~~~~~
Clay Redding
Digital Project Coordinator
Network Development & MARC Standards Office
Library of Congress
LA308, Mail Stop 4402
101 Independence Ave. SE
Washington, DC 20540
[log in to unmask]
202-707-7196 
~~~~~~~~~~~~~~~~~~

>>> Lee Mandell <[log in to unmask]> 7/12/2007 7:33 AM >>>
Has anyone ported this to Java? I would really like to incorporate  
the normalization into the Archivists' Toolkit as part of our import  
process.


Lee Mandell
Design Team Manager
Archivists' Toolkit
New York University
617-666-8486
cell: 617-512-0194
e-mail: [log in to unmask] 

"where are we going and why am I in this handbasket?"


On Jul 11, 2007, at 12:45 PM, Jason Casden wrote:

> Hi all,
>
> It's great to hear that people are still interested in this script!  
> Sorry for any confusion in the installation instructions. Let me  
> try to clear it up a little bit. It sounds like you all have been  
> able to install ActivePerl (by the way, it should be safe to  
> install the most current version), so the steps after that should  
> go about as follows:
>
> 1) Go to your Start Menu, the "ActivePerl 5.8...." folder, and run  
> "Perl Package Manager"
> 2) Click the leftmost icon on the menu bar ("View All Packages")
> 3) Type XML-Twig
> 4) XML-Twig should show up in the main display area. Right click on  
> it and select "Install XML-Twig."
> 5) Go to the File menu and select "Run Marked Actions."
> 6) Exit.
>
> This process is a little more user friendly than it was when the  
> script was released. Also, it looks like HTML::Entities is packaged  
> with ActivePerl already. Now...
>
> 7) Create a directory for the date normalizer script.
> 8) Put copies of the finding aids you want to work on in that  
> directory.
> 9) Put the script in that directory.
> 10) Run the script by double clicking it, and then following the  
> directions.
>
> Hopefully this will work for everyone. Please let me know if you  
> have any other problems with it. Also, if you notice any bugs or  
> have ideas for possible enhancements, I am happy to try to improve  
> the tool.
>
> Regarding Date::Manip, I am pretty sure there were some good  
> reasons that I ended up doing the date manipulation manually (using  
> regular expressions), but they aren't all available to me right  
> now. I know one issue was the impressive creativity of the people
> who entered dates originally, which gave us a lot of formats that
> were ambiguous, were not usable by Date::Manip, or contained
> information (like question marks) that we wanted to capture somehow:
>
> 12-1/1979
> 10-1978
> 1952?
> 1950's
> 195?
> around 1977
> I don't know
> undated
> not dated
> n.d.
> the 50's
>
> It's probably possible to simplify the script by using Date::Manip  
> to work with the more standard-ish dates, but I ended up using  
> regex the whole way.
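
Since the question at the top of this thread is about a Java port, here is a minimal sketch of what the regex approach could look like in Java. It is not a translation of tri-XMLdate-normalizer.pl: the class name, the method name, and the choice of patterns are all assumptions made for illustration, and it covers only a few of the forms in the list above. It also assumes that a decade is written as a slash-separated ISO 8601 range, and it drops the question mark from the normalized value rather than capturing it. Anything it cannot interpret comes back empty, on the idea that those cases go to a person, as the script does.

```java
import java.util.Locale;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; none of these names come from the original Perl script.
class DateNormalizerSketch {
    // "10-1978": month, hyphen, four-digit year
    private static final Pattern MONTH_YEAR = Pattern.compile("(\\d{1,2})-(\\d{4})");
    // "1952", "1952?", "around 1977"
    private static final Pattern YEAR = Pattern.compile("(?:around\\s+)?(\\d{4})\\??");
    // "1950's" or "1950s"
    private static final Pattern DECADE = Pattern.compile("(\\d{3})0'?s");
    // "195?": last digit unknown, treated here as the whole decade (an assumption)
    private static final Pattern UNKNOWN_LAST_DIGIT = Pattern.compile("(\\d{3})\\?");

    /** Returns a normalized value, or empty when a person needs to decide. */
    static Optional<String> normalize(String raw) {
        String s = raw.trim().toLowerCase(Locale.ROOT);

        Matcher m = MONTH_YEAR.matcher(s);
        if (m.matches()) {
            int month = Integer.parseInt(m.group(1));
            if (month < 1 || month > 12) {
                return Optional.empty();
            }
            return Optional.of(String.format("%s-%02d", m.group(2), month));
        }

        m = YEAR.matcher(s);
        if (m.matches()) {
            return Optional.of(m.group(1));
        }

        m = DECADE.matcher(s);
        if (!m.matches()) {
            m = UNKNOWN_LAST_DIGIT.matcher(s);
        }
        if (m.matches()) {
            // Decade expressed as a range from year 0 to year 9
            return Optional.of(m.group(1) + "0/" + m.group(1) + "9");
        }

        // "12-1/1979", "the 50's", "n.d.", "undated", "I don't know", ...
        return Optional.empty();
    }

    public static void main(String[] args) {
        String[] samples = {"10-1978", "1952?", "1950's", "195?", "around 1977", "12-1/1979", "n.d."};
        for (String sample : samples) {
            System.out.println(sample + " -> " + normalize(sample).orElse("(ask the user)"));
        }
    }
}
```

Returning an empty Optional instead of guessing keeps the institution-specific decisions (such as which century "the 50's" means) out of the code.
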
>
> By the way, this script was the brainchild of Amy McCrory at Ohio  
> State, so there are some decisions the script makes in ambiguous  
> situations that are in line with practices at OSU that may not  
> agree with those of your institution. In most situations where  
> there isn't an obvious way to normalize a date, however, the script  
> asks the user for their own interpretation.
>
> And, again, please don't hesitate to get ahold of me with any  
> questions.
>
> Jason
> -- 
> Jason Casden / [log in to unmask] 
> Digital Projects Librarian
> Ehrman Medical Library, NYU School of Medicine
> v: 212-263-8935  f: 212-263-6534
>
> On 7/11/07, Michele Combs < [log in to unmask]> wrote:
> I'm attempting the exact same thing; any offlist replies please cc me :)
>
> Michele C.
>
> >>> [log in to unmask] 7/11/2007 11:15 AM >>>
> This is something I have been meaning to attack as I am in exactly the
> same position as Deena (i.e. wanting to normalize my dates but having no
> immediate need to do so.)
>
> Following Joseph's post, I have installed ActivePerl 5.8.3.809 (on a
> Windows2000 machine.) I have also downloaded XML::Twig and
> HTML::Entities modules and the script tri-XMLdate-normalizer.pl from the
> SAA website.
>
> But I have no knowledge of Perl or any clue what to do next. Is there
> any online guide to such things that will instruct me what to do?
> Failing that, can anyone here enlighten me? For instance, in what
> directory do I place the modules (or what do I do with them) to make
> sure they are connected? And then how do I make the whole thing work?
>
> I'm sure it's pretty simple, but a little guidance will be much
> appreciated.
>
> Thanks in advance,
>
> Jonathan Lill
> Project Archivist
> The Museum of Modern Art
> Museum Archives
> 45-17 32nd Place
> Long Island City, NY 11101
> 212.333.6514
> [log in to unmask] 
>
>
> -----Original Message-----
> From: Encoded Archival Description List [mailto: [log in to unmask]] On Behalf Of joseph greene
> Sent: Wednesday, July 11, 2007 4:22 AM
> To: [log in to unmask] 
> Subject: Re: Normalization of Dates - clarification
>
>
> We are using the Perl program listed on the SAA's Tools and Helper Files
> webpage, at https://www.archivists.org/saagroups/ead/tools.html .
>
> The program is called tri-XMLdate-normalizer.pl (
> http://monkey.org/%7Ecaz/TRI-scripts/tri-XMLdate-normalizer.pl ) and
> works beautifully. Once you have it set up, which is quite easy once you
> understand how the Perl processor works, you can normalize
> (@normal="iso8601 value") dates in a finding aid within minutes.
>
> Good luck.
>
> Joseph Greene
> Irish Virtual Research Library and Archive Project (HII),
> James Joyce Library,
> UCD,
> Belfield,
> Dublin 4.
>
> (t) 01 716 7506
> (e) [log in to unmask] 
> (w) www.ucd.ie/ivrla 
>
> ----- Original Message -----
> From: Richard Davis <[log in to unmask]>
> Date: Tuesday, July 10, 2007 11:46 pm
> Subject: Re: Normalization of Dates - clarification
> To: [log in to unmask] 
>
> > Michele Combs wrote:
> > > As far as the way the visible date is written, I would think that you
> > > need not bother changing that at all unless you have more time and money
> > > than you know what to do with; we don't usually alter the way the dates
> > > appear in legacy finding aids when we do conversion, unless for some
> > > reason it affects the usefulness of the finding aid (e.g. if the format
> > > is so vague as to be uninterpretable or ambiguous enough to lead to more
> > > than one interpretation).
> >
> > Hi
> >
> > Just thought I'd say that normalising dates needn't be a completely
> > manual and painful process: programming can come to your (finding) aid.
> > With a comparatively simple script one could parse EAD files, isolate
> > the non-normalised date elements, and generate new normalised dates.
> > Perl's DateManip module, for example, can reliably identify a wide
> > range of dates in vernacular forms, and output them in ISO8601 or what
> > you will. You'd still need to verify the output, but if there's nothing
> > too kinky, it might reliably do the lot.
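
The "isolate the non-normalised date elements" step could be sketched in Java roughly as below, using the DOM parser that ships with the JDK. This is only an illustration under several assumptions: the class and method names are made up, it looks only at unitdate elements (a real pass would probably also want date elements), and the sample is a bare fragment with no DOCTYPE. A real EAD file that declares the EAD DTD would need that DTD to be resolvable by the parser.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical sketch; the names here are not taken from any existing tool.
class UnnormalizedDateFinder {
    /** Returns the text of every unitdate element that has no normal attribute. */
    static List<String> findUnnormalized(String eadXml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(eadXml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse EAD XML", e);
        }

        NodeList dates = doc.getElementsByTagName("unitdate");
        List<String> pending = new ArrayList<>();
        for (int i = 0; i < dates.getLength(); i++) {
            Element date = (Element) dates.item(i);
            // Dates that already carry a normal attribute are left alone
            if (!date.hasAttribute("normal")) {
                pending.add(date.getTextContent().trim());
            }
        }
        return pending;
    }

    public static void main(String[] args) {
        String sample = "<c01><did>"
                + "<unitdate normal=\"1978-10\">10-1978</unitdate>"
                + "<unitdate>around 1977</unitdate>"
                + "<unitdate>n.d.</unitdate>"
                + "</did></c01>";
        System.out.println(findUnnormalized(sample));
    }
}
```

The list it returns is what would then be handed to whatever does the actual date interpretation, with the output still checked by a person as suggested above.
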
> >
> > http://www.cise.ufl.edu/~sbeck/DateManip.html#examples 
> >
> > Little scripts can speed things up a lot - try to find a friendly hacker
> > to write you one! :)
> >
> > Normalising dates may not be a pressing imperative, if your present
> > system merely displays them, but posterity is likely to be grateful if
> > it starts wanting to sort or search collections, or do other analysis,
> > based on date-like properties.
> >
> > Hope this helps
> >
> > Richard
> >
> >
> > --
> > / Richard M. Davis
> > \ Digital Archives Specialist
> > / University of London Computer Centre (ULCC)
> > \ 20 Guilford Street, London WC1N 1DZ
> > / +44 (0) 20 7692 1350
> > / [log in to unmask] 
> >
>