Newsletter 31

February 2003

What is quality data?

Data quality issues are central to the success of the National Biodiversity Network (NBN). Unfortunately, we seem to struggle to find words in the English language that describe the concepts succinctly. The use of unfocused and inelegant language can in turn hinder both debates within the profession and communication with users of biodiversity data. At a time when the NBN is seeking to define a metadata standard, this short article aims to stimulate a debate on suitable words for the concepts, in the hope that we may be able to reach agreement and improve communication. The majority of biodiversity information users are implicitly asking the question “To what extent does this data set provide a perfect picture of the resource, of this/these species/habitat(s) in the area in which I am interested?”. No data set can achieve this perfect picture. The obligation of the data set provider is therefore to describe the degree to which the data set departs from perfection, so that fit-for-purpose can be assessed by the user.

Several aspects of distortion of the perfect picture can appear in any data set. The picture may be “grainy” because of vague geographical referencing; it may have changed since the picture was taken; there may be mistakes; and some crucial parts of the picture may be missing. I suggest that the most suitable words to describe these attributes are precision, currency, reliability and completeness. Precision is primarily concerned with the geographical precision used for recording and capturing the data. A 100 metre square record clearly has more potential applications than a 10 kilometre square record. Taxonomic precision also matters, as a species record is more useful than a record at the family or order level, and a precise date may also be relevant.

Currency relates to the time elapsed from the present to the date or period at which the data was collected. By definition it declines over time for all data sets other than those that are continually updated.

Reliability is concerned with the well-established concepts of verification and validation and is a measure of the strength of these processes. Validation may in turn include related concepts such as the geographical accuracy of the original recorder (which of course is different from geographical precision) and the accuracy of data capture and transfer.

Completeness may well be the most useful attribute for fit-for-purpose assessment but is also the most difficult to describe. Ideally it should be related to an assessment of the real distribution or abundance of the species/habitat in the target area in the time period covered by the data set. Unfortunately this brings in subjective judgement as the real distribution and abundance are usually unknown. Completeness also needs to be related to the geographical precision attribute.

In general terms, each of these attributes is a positive measure in that the higher the score the more likely it is that a user will find the information useful. A notable exception to this observation is the application of historical data (i.e. low currency) for particular purposes. Furthermore data sets with high scores for these attributes can often be used at lower levels if it suits the user (e.g. a precise data set can be analysed at a lower resolution), but the converse never applies. It must also be noted that the perfect data set is also impossibly expensive in resource terms, reinforcing the view that fit-for-purpose should be the aim rather than perfection. However, the underlying maxim of the NBN “collect and capture data once, use for many purposes” suggests that aiming as high as possible for each data quality attribute is desirable, as this will maximise future use of the data. Fit-for-purpose and relevance are attributes that can be assessed by the user. I would suggest that the four attributes of precision, currency, reliability and completeness are inherent data set qualities that biodiversity data managers seek to measure and provide.
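The one-way nature of precision can be illustrated with a short sketch. It assumes records carry Ordnance Survey style grid references (two letters followed by an even number of digits); the function name and the sample reference are hypothetical. A precise reference can always be truncated to a coarser square, but digits that were never recorded cannot be recovered.

```python
def coarsen(grid_ref: str, digits_per_axis: int) -> str:
    """Truncate a grid reference to fewer digits per axis.

    Assumes two letters followed by an even number of digits,
    e.g. "SK234456" (a 100 m square).
    """
    letters, digits = grid_ref[:2], grid_ref[2:]
    if len(digits) % 2 != 0:
        raise ValueError("expected an even number of digits")
    half = len(digits) // 2
    if digits_per_axis > half:
        # The converse never applies: precision cannot be invented.
        raise ValueError("cannot add precision that was never recorded")
    easting, northing = digits[:half], digits[half:]
    return letters + easting[:digits_per_axis] + northing[:digits_per_axis]


print(coarsen("SK234456", 2))  # the enclosing 1 km square
print(coarsen("SK234456", 1))  # the enclosing 10 km square
```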

Bill Butcher, NFBR Chairman

Construction of the Flora of Northern Ireland Web Site

It was decided to write a flora of Northern Ireland as a web site rather than publish it in book form because it is a cost-effective way of displaying the information held on the Recorder database at the Centre for Environmental Data and Recording (CEDaR), Ulster Museum. Over 700,000 records were collected and gathered together onto a database for the Botanical Society of the British Isles (BSBI) Atlas 2000 Project in Northern Ireland. Because of the dedicated efforts of the BSBI vice-county recorders in Northern Ireland, the coverage for most native species was very good. Therefore, the 10km distribution maps gave a very good impression of the distribution of most species throughout Northern Ireland.

The general principles of writing a flora in the form of web pages are the same as those that would pertain if it were being published as a book. A decision must be made as to which species are to be included and the level of detail for the distribution maps. A design for displaying the species information needs to be developed into a standard format. Decision-making concerning the use of colour is different for web sites as opposed to publishing on paper. Including colour photographs when printing a flora has a big impact on the cost of the publication, whereas it is effectively free when publishing via a web site (although graphics do take up room for storage and bandwidth for display, both of which cost money).

Only certain colours are considered to be "web-safe", i.e. likely to display on any monitor the way you intend. This is less of a problem with new computers, as the standard in 2002 is for displays that are called 'true colour'. These display 16.7 million individual colours (256 levels each in red, green and blue), which gives a better range of colours than commercial colour printing. There are also advantages to the type of image that can be used. For printing, an image has to be available as a slide for high quality scanning, whereas the level of resolution required for display on the web is much lower, enabling the use of digital images and images scanned from ordinary prints. Another advantage of web sites is that they can be updated when new information becomes available. The design and layout can also be adapted as the project evolves.
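The colour counts follow from simple arithmetic. The sketch below multiplies out the 256 levels per channel of a true colour display and, for comparison, the six levels per channel of the commonly cited web-safe palette (that palette definition is general background, not something taken from the project).

```python
LEVELS_TRUE_COLOUR = 256  # levels per channel on a true colour display
LEVELS_WEB_SAFE = 6       # hex values 00, 33, 66, 99, CC, FF per channel

# Three independent channels (red, green, blue) multiply together.
true_colour_total = LEVELS_TRUE_COLOUR ** 3
web_safe_total = LEVELS_WEB_SAFE ** 3

print(f"true colour: {true_colour_total:,} colours")
print(f"web-safe:    {web_safe_total:,} colours")
```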

This article assumes a certain level of knowledge in the construction of web pages. A basic understanding of html is required, though this can be picked up relatively easily using some of the many books available. The book "Creating Web Pages Simplified" by maranGraphics, published by IDG Books World-wide, Inc., is very useful for giving a basic grounding in the principles involved. Software programs that write the pages for you, such as Microsoft FrontPage or Macromedia Dreamweaver, are available; these are both known as "WYSIWYG" packages, i.e. "What You See Is What You Get". Be careful if you start to learn web page construction with packages like these: while they make relatively few demands on the designer's knowledge of html, you will then find it difficult to refine your design in the html code. We used Macromedia HomeSite to write the Flora of Northern Ireland web site. This is a pure html editing package and requires knowledge of html that takes some time and effort to learn. However, once learnt, you have much more control over how your page looks and works, even if you move on to something like Dreamweaver or FrontPage for more elaborate layouts. Another aspect of some "WYSIWYG" packages to take into account is that they often write "sloppy" html, filling the text with unnecessary and duplicated tags. This standard of html will generally work when the browser being used to view the pages is MS Internet Explorer, but Netscape is rather more fussy and may not display your page the way you intended. It is at this point that you need your skills in html editing. The other software needed is an image manipulation package such as PhotoShop Elements or Paint Shop Pro to prepare your images for display on the web.

In order to separate the content (text, images) from the design (html mark-up tags), we decided to make the writing of the pages automatic and database driven. After designing the format of the species pages, we programmed an output routine in Revelation Basic into Recorder which read data from fields in the Recorder database and combined this with a template. This meant that when we made a select list of all the species we wanted represented on the web site, we were able to run the routine to write the html for each species directly into a .htm file with the correct name. To link all this together we decided to call our species pages by their Recorder species numbers and for any additional image and map pages the species number would have a suffix e.g. "_m" for the map page. Images representing the species and distribution maps also had filenames based on Recorder numbers.
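The original routine was written in Revelation Basic inside Recorder and is not reproduced here. As a rough illustration of the same idea, the Python sketch below merges one species record with a page template and derives every filename from the species number. The template layout, the field names and the species number 1234 are all hypothetical.

```python
from string import Template

# Hypothetical page template; the real site layout is not shown in the article.
PAGE_TEMPLATE = Template(
    "<html><head><title>$latin_name</title></head><body>\n"
    "<h1><i>$latin_name</i> ($common_name)</h1>\n"
    '<img src="$number.jpg" alt="$common_name">\n'
    '<p><a href="${number}_m.htm">Distribution map</a></p>\n'
    "</body></html>\n"
)


def page_filename(species_number: int, suffix: str = "") -> str:
    """Name a page after its species number, e.g. 1234.htm or 1234_m.htm."""
    return f"{species_number}{suffix}.htm"


def build_page(species: dict) -> tuple[str, str]:
    """Return (filename, html) for one species record."""
    return page_filename(species["number"]), PAGE_TEMPLATE.substitute(species)


# One record standing in for a row read from the database.
species = {
    "number": 1234,
    "latin_name": "Orchis mascula",
    "common_name": "Early-purple Orchid",
}
filename, html = build_page(species)
print(filename)
print(html)
```

Because the number appears in the page name, the image name and the map link, changing the template once changes every species page on the next run.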

In the Flora web site the Recorder species number is used several times throughout each set of species pages. It is the source of the name of the page, the links between the subsidiary pages, the names of relevant images and the map. If the pages were edited by hand there would be a huge amount of repetitive work involved. It is better to get the database to do this work as this is just what computers are good at. The use of the Recorder species number also allows for easy linking with the NBN Gateway as its indexes are based on the Recorder 2000 species numbers which were in turn based on the original Recorder numbers. In this way, links to the Flora of Northern Ireland web site appear in many botanical searches of the NBN Gateway. In the next version of the site we intend to move the data out of Recorder and into Access. In this way the pages will be built dynamically by drawing information from the database using Active Server Pages (ASP) to pass the information out to a request from a browser. We will then only need to store a small number of pages (plus the database) to display the information about the thousand plus species which are covered.

An important factor that needs careful consideration is how people are expected to navigate the site. When we started writing the Flora of Northern Ireland web pages we restricted ourselves to the 30 or so orchid species found in Northern Ireland. This meant that navigating to each species was as easy as using the index of a book, as each species name was put in a table and linked to the relevant species table. However, when the full complement of species to be represented in the flora was added, the total was approximately 1,100. Downloading a table of that size and finding a species in it was no longer a practical way to navigate. Our initial solution was to create a hierarchical species tree where the species were grouped in their families. This tree is designed to look like the method used to view files and folders in Microsoft Windows Explorer. This worked as a good way of viewing the species on the site but required a level of botanical knowledge that we felt was too high. It was decided to add a “free text” search facility to allow people with little botanical knowledge to type in a name or part of a name in either a Latin or common name field and search through all the species available. Incorporating a search facility into the site had technical implications, such as the type of web server on which the web site is stored. In our case it required knowledge of programming Active Server Pages (ASP) and the use of MS Access. There are other programming languages that will allow you to set up a search facility on a web site that you may prefer to use. For ASP to work, the web site and database have to be on an NT server rather than a UNIX server. It is therefore very important to choose your ISP carefully with your web design and programming requirements in mind. This kind of work can be done by a programmer, but you would need to remember to include their costs in any grant applications as they may be expensive. The ASP we use is only a small amount of programming, but did require prior knowledge of Visual Basic.
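The search logic itself is small. The sketch below is not the site's ASP code; it is an in-memory Python stand-in, with hypothetical field names and a three-species sample list, showing a case-insensitive match on part of a Latin or common name.

```python
def search_species(species, latin="", common=""):
    """Return species whose names contain the given fragments.

    Matching is case-insensitive; an empty fragment matches anything,
    but a completely empty query returns no results.
    """
    latin, common = latin.strip().lower(), common.strip().lower()
    if not latin and not common:
        return []
    return [
        s for s in species
        if latin in s["latin_name"].lower()
        and common in s["common_name"].lower()
    ]


# Sample rows standing in for the species table.
SPECIES = [
    {"latin_name": "Dactylorhiza fuchsii", "common_name": "Common Spotted-orchid"},
    {"latin_name": "Orchis mascula", "common_name": "Early-purple Orchid"},
    {"latin_name": "Primula vulgaris", "common_name": "Primrose"},
]

for hit in search_species(SPECIES, common="orchid"):
    print(hit["latin_name"], "-", hit["common_name"])
```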

It was necessary to have each element of the species pages entered into the database. We used the Local Species table in Recorder and added fields that we required, including "description", multivalued "photograph" and "photographer" details. The "next species number" field was filled automatically by a specially written AREV routine working on a select list sorted in taxonomic order. All the species accounts were written by Paul Hackney, Keeper of Botany, Ulster Museum, in MSWord and then edited and prepared before being added to the database. There were several things that needed to be done to these accounts before being entered into Recorder. These included manually removing the blank lines between paragraphs and putting a # symbol at the start of each paragraph instead, and inserting $ signs at the start and end of each Latin name, as text formatting cannot generally be stored in a database field. These symbols were programmed into the AREV output routine to trigger specific actions. For instance, each # symbol encountered was replaced by end and start paragraph html tags (</p><p>), thereby inserting formatting tags into the html so that the page would display in a web browser with the desired formatting. The $ symbols were replaced by html tags for italics so that species names were displayed correctly. When the editing was complete the species account could be selected, copied and pasted into the local species account field in the local species table. This is done by making Recorder run in a window, opening the local species table and tabbing down to the local species account field, right clicking on the title bar of the window and choosing edit and then paste.

This results in the account being pasted into the local species account field, from which the output routine retrieves the text and inserts it at the appropriate place when the web pages are written.
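The marker substitution can be sketched in a few lines of Python. This is an illustration rather than the AREV routine: the choice of the `<i>` tag for italics is an assumption, and the surrounding page template is assumed to supply the outermost paragraph tags.

```python
import re


def markers_to_html(account: str) -> str:
    """Convert the plain-text markers in a species account to html tags."""
    # Each $...$ pair wraps a Latin name, which is rendered in italics.
    text = re.sub(r"\$(.+?)\$", r"<i>\1</i>", account)
    # Each # marks the start of a paragraph: close the previous paragraph
    # and open a new one.
    return text.replace("#", "</p><p>")


sample = "#A common plant. $Primula vulgaris$ grows in hedges.#It flowers early."
print(markers_to_html(sample))
```

One consequence of this scheme is that an account cannot contain a literal # or $ character, since both would be treated as markers.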

If a lot of text is available at one time it may be better to import the text, but this requires a thorough understanding of AREV. We should say at this point that it is quite possible to transfer all this methodology to another database, such as Access, as used by Recorder 2002. Indeed, if the species numbers are generating filenames longer than 8 characters, then the output routines would need to be capable of using long filenames, which DOS cannot do.
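Whether a generated name fits the DOS limit of eight characters plus a three-character extension is easy to test before output. The helper below is a hypothetical sketch that checks lengths only and ignores the other DOS restrictions on permitted characters.

```python
def fits_dos_8_3(filename: str) -> bool:
    """Check that a filename fits the DOS 8.3 length limits."""
    name, dot, ext = filename.rpartition(".")
    if not dot:
        # No extension at all: the whole string is the base name.
        name, ext = filename, ""
    return 1 <= len(name) <= 8 and len(ext) <= 3 and "." not in name


for candidate in ("1234_m.htm", "1234567_m.htm"):
    print(candidate, fits_dos_8_3(candidate))
```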

We decided to make the size of our maps and images standard throughout the site, keeping them square and 300 x 300 pixels for the small ones and 600 x 600 pixels for the large ones. This meant that we could write into the output routine the sizes of the expected image in each view of the species pages. This kept the process simple rather than having to have fields in the local species table to enter the width and height of each image prepared for the site. This also meant that the positioning of our text and images on the species pages was uniform and therefore we didn't have any "nasty surprises" when we output the pages.

An understanding of how to scan and manipulate images is essential for web page construction. The use of images on web sites has implications for download times when the page is being viewed online. People tend to be relatively impatient while online and will move on to another site if your pages take too long to download and become viewable. It is best to keep images small and of a reasonable file size to keep download times to a minimum, with only a few images per page. JPEG compression allows colour photographs to be compressed by a variable amount, helping keep file sizes down, but this does result in some loss of information and blurring. For maps, where solid lines and only a few colours are needed, GIF files are more appropriate: their compression is lossless, yet the files remain small because a limited palette needs less information per pixel.
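The effect of palette size can be seen by working out raw image sizes before any compression is applied. The sketch assumes 24 bits per pixel for a full-colour photograph and 8 or 4 bits per pixel for palette images of 256 or 16 colours; the 300 x 300 pixel size matches the small images described above.

```python
def raw_size_bytes(width: int, height: int, bits_per_pixel: int) -> int:
    """Uncompressed image size in bytes."""
    return width * height * bits_per_pixel // 8


photo = raw_size_bytes(300, 300, 24)    # full-colour photograph
map_256 = raw_size_bytes(300, 300, 8)   # map with a 256-colour palette
map_16 = raw_size_bytes(300, 300, 4)    # map with a 16-colour palette

print(f"24-bit photograph: {photo:,} bytes")
print(f"256-colour map:    {map_256:,} bytes")
print(f"16-colour map:     {map_16:,} bytes")
```

Compression then reduces each figure further, so these are upper bounds rather than expected file sizes.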

The Flora of Northern Ireland web site can be viewed at http://www.ulstermuseum.org.uk/flora and is one of a suite of natural history sites under the umbrella of Habitas Online, the Ulster Museum Sciences Division’s web presence at http://www.habitas.org.uk.

Fiona Maitland & Bernard Picton, Sciences Division, Ulster Museum, Belfast.

The Making of the Flora of Cornwall on CD-Rom

The Flora of Cornwall was published in 1999 both as a book and a CD-ROM. It was possible to contemplate developing a CD-ROM version because all the data used for the book, and the written text, were held in electronic form. The 680,000 vascular plant records were stored in the NEW ERICA database and the text was written in the desktop publishing program, Aldus Pagemaker. The original ERICA database was a program developed by the author for the Cornish Biological Records Unit, which closed in 1996. Several methods of producing the CD-ROM version were considered.

The simplest was to take the electronic text and save it as an Adobe Acrobat file (pdf format). This creates a copy of the book as published on paper, which can be read on screen by the Adobe Acrobat Reader program. This approach works, but the result on screen is not as satisfactory as reading a book in the hand and there is no added value. Secondly, the text can be converted into html format and published on the CD-ROM as web pages, much like creating a website on the Internet. This method can be used purely to produce an electronic copy of the book; however, with careful design, and time and effort spent in creating hypertext links and adding maps and photographs, it has much more potential and becomes a more valuable resource. Little programming skill is needed and the conversion process can be achieved with relative ease, as many word processors and desktop publishing programs will do it. There is limited scope for using the database of botanical records within this type of CD-ROM, but that technology is fast developing. Brian Bonnard has successfully pioneered this approach for his Wild Flowers of Alderney and An Illustrated Guide to the Wild and Naturalised Flowers of the Channel Islands. The third option, and the one actually used, was to program an interactive multimedia CD-ROM from scratch using Microsoft Visual Basic 6.0. This is the most difficult option, as it requires good programming skills, but it offers the greatest flexibility and can make full use of database operations. Thus it potentially makes much better use of the available material, enabling the biological records to be interrogated and fully integrated with maps, photographs and the text from the printed book.

Figure 1. CD Front screen.

The pre-publication offer set quite a tough specification, as the CD-ROM software had to work with both Windows 3.1 and 95 on what would now be seen as very low specification machines. Additionally, there had to be room enough on the CD-ROM for a substantial number of photographs and a copy of the NEW ERICA database (itself 200 MB in size). The software for the CD-ROM was written using Microsoft Visual Basic 6.0. This meant that all the data had to be stored in a Microsoft Access database. The Access tables can be made fully secure, and this has been done in more recent versions of the CD-ROM. In addition the data could also be encrypted, but it was not felt necessary to do this. To keep the size of the Access database small enough to fit on a CD-ROM with all the photographs etc., the fields that were exported from NEW ERICA were restricted.

The data export was straightforward once its specification had been decided, and involved writing some simple routines that exported the information from NEW ERICA as delimited text files that could easily be imported into Access using the standard Access import wizard. This process worked remarkably well, although the loss of information from the original database and structural differences between the two databases did create a number of apparently duplicated plant records. For example, when exported, a record of Fumaria occidentalis growing in a hedge and one of it growing in a field at the same site, becomes two identical records, as the habitat details are lost.
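The way that dropping a field creates apparent duplicates can be shown with a small sketch. The records, site name, date and field names below are invented for illustration and are not taken from NEW ERICA.

```python
from collections import Counter

# Two records that differ only in habitat, plus one unrelated record.
records = [
    {"species": "Fumaria occidentalis", "site": "Site A",
     "date": "1995-06-01", "habitat": "hedge"},
    {"species": "Fumaria occidentalis", "site": "Site A",
     "date": "1995-06-01", "habitat": "field"},
    {"species": "Primula vulgaris", "site": "Site A",
     "date": "1995-06-01", "habitat": "hedge"},
]

# The habitat field is left out of the export to save space.
EXPORTED_FIELDS = ("species", "site", "date")


def export_rows(records, fields):
    """Keep only the exported fields of each record."""
    return [tuple(r[f] for f in fields) for r in records]


def apparent_duplicates(rows):
    """Return each exported row that occurs more than once, with its count."""
    return {row: n for row, n in Counter(rows).items() if n > 1}


rows = export_rows(records, EXPORTED_FIELDS)
print(apparent_duplicates(rows))
```

A check of this kind run after export would at least flag which rows have become indistinguishable, even though the lost habitat detail cannot be restored from the export alone.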

All the photographs were loaned by fellow botanists or were from my own collection. It was important to choose a maximum size for the photographs as that made programming easier, especially if the images were no larger than 640 x 480, which was the common screen resolution at the time. The choice of photograph size was also critical in determining the file size of the image. Prints were scanned at 300 dpi resolution and transparencies at 1200 dpi. Every photograph was resized to 480 pixels in height, sharpened using appropriate image processing software and reduced to 256 colours (including the 16 Windows colours). Brief details about the photograph were typed into an Access database table, including photograph title (usually Species Name), photograph code number, who took the photograph (so that their copyright can be acknowledged) and species code number if it were a plant photograph. With over 3000 photographs on the latest CD-ROM, this has proven to be a very time-consuming process.
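Resizing to a fixed height while preserving proportions is a one-line calculation. The sketch below shows only that arithmetic, not the image processing itself, and the scan dimensions used are hypothetical.

```python
def resize_to_height(width: int, height: int, target_height: int = 480):
    """Return (new_width, new_height) scaled to the target height."""
    return round(width * target_height / height), target_height


# A landscape scan and a portrait scan, both hypothetical.
print(resize_to_height(1600, 1200))
print(resize_to_height(1200, 1800))
```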

Figure 2. Atlas screen shot.

On the first CD-ROM, there were in fact two copies of every photograph: a small one and a full screen one. The small one can be seen inset in the map of the species screen (Fig. 2). A mouse click on the small photograph appeared to enlarge it to full screen size; in fact, the separate full screen copy was displayed in its place. The maps were stored as image files rather than vector co-ordinates. This was easier to program and was faster to display, but it meant the maps had to be a fixed size and there was little prospect of writing software to zoom in on specific parts of the map.

The text from the book had to be treated in several ways. The introductory chapters were saved as RTF files, as that enabled the italics to be retained and photograph code numbers to be embedded in the text. This was done so that illustrative photographs could be displayed when one scrolled down through the text to read it. The species accounts were cut and pasted from the desktop publishing software into the Access database and linked to the table containing the taxonomic details. In this way, whenever a species is chosen the relevant text from the book can be displayed (Fig. 2). It was also found quicker to store some indexes as text files on the CD-ROM rather than get the software to create them by undertaking searches within the database. This practice has been dispensed with in the latest version of the software.

Since the publication of the first version, development of the software has continued and a total of five versions have been produced, each including an update of the vascular plant records held on the NEW ERICA database, more plant photographs, as well as significant improvements to the CD-ROM software. This development work has taken place to attempt to keep pace with the rapid changes in PC technology and to cater for the much higher specification machines and new Microsoft operating systems.

  • Version 1.0 of the CD-ROM met the pre-publication specification.
  • Version 2.0 was supplied soon after the pre-publication orders had been despatched. This short-lived version dropped the NEW ERICA database to gain space for hundreds of additional plant photographs and included a few software enhancements.
  • Version 3.0 entailed an extensive re-write of the software in order to drop support for Windows 3.1 and concentrate on Windows 95 and 98. One immediate advantage was that the photographs could be stored as GIF files rather than BMP files. These are much smaller in size, and so their use allowed many more photographs to be stored on the CD-ROM. Dropping Windows 3.1 also simplified the programs, as the software no longer needed to react differently according to the operating system.
  • Version 4.0 saw large alterations to the software as it was changed from DAO (Data Access Objects) to ADO (ActiveX Data Objects) programming. This opened up many new and more flexible ways to manipulate the data and was appreciably faster. Unfortunately, it was found that, for some as yet unexplained reason, not all computers running Windows 95 and above could install the software. This version allowed the option of installing the database on the PC, which is much faster than performing database operations on the CD-ROM itself.
  • Version 5.0 now has 720,000 records in the database. As 17in monitors have become standard with new PCs, the screens have been redesigned accordingly. All photographs were changed to JPEG format to save space, but displays set at 256 colours could no longer be supported.

Getting a Flora printed as a hardback book is a costly business, such that it is only viable to contemplate repeating that process once every 20 or so years. It cost £6000 to prepare camera-ready copy and to print and bind 500 copies. A total of 470 Floras have now been sold. A CD-ROM does have development costs over and above those involved in writing the text of a Flora, but it does not carry the same initial high printing costs. Indeed, CD-ROMs can be cheaply duplicated in small quantities, to order. Whilst the CD-ROM is less favoured in terms of sales (approximately one CD-ROM sells for every five copies of the book), the advantages of CD-ROM media are clear. It is a much more powerful publishing medium, where understanding accrues from the ability to manipulate large volumes of text, data, graphics and colour photographs. It is also inherently more dynamic, in that periodic updates can be issued to keep pace with the ongoing biological recording effort. Above all, CD-ROMs are fun! The Flora of Cornwall CD-ROM is available from the author at the address below, as well as from the NHBS and Summerfield Books. Full price is £40, but those who have already purchased a copy can obtain an upgrade for £20.

Colin French, 12 Seton Gardens, Camborne TR14 7JS. cnfrench@zawn.freeserve.co.uk

The Derbyshire Flora Project

In 1994, botanists in Derbyshire responded to the proposal from the Botanical Society of the British Isles (BSBI) to produce a new national atlas of vascular plants covering the whole of Britain and Ireland. The scheme, called Atlas 2000, was officially launched in 1996 and the county plant recorder and Derby Museum, who run the Derbyshire BRC, agreed to collaborate on collecting data for this project as well as for a new mapped Flora of Derbyshire. We decided that our area of coverage would include both the botanical vice-county (VC57) and the modern county boundary of Derbyshire.

 

Jacob’s Ladder Polemonium caeruleum. Courtesy of Derbyshire Wildlife Trust

The last Flora of Derbyshire appeared over thirty years ago (Clapham 1969), just eight years after the first Atlas of the British Flora (Perring & Walters 1962). By the time it was published, however, many of the records it contained were well over twenty years old. Two supplements were subsequently produced, listing new county and 10km square records (Patrick & Hollick 1975; Hollick & Patrick 1980), but there were to be no further updates until the publication of a definitive county plant checklist earlier this year as part of the project currently being described (Moyes & Willmot 2002).

Data Collection
Procedures were soon established for recording all plant species seen in each Derbyshire hectad (10 km square). First of all, a series of Mastercards was created for each of these squares, summarising published records in past Floras, and a team of volunteers was gradually recruited to collect data on a monad (1 km square) basis across the county. This level of recording was felt to be essential if data were to be useful both nationally and locally. Further details were requested for all rare or unusual sightings, and next season we shall be treating invasive aliens in a similar way. Although botanists working alone or in pairs collected most data, regular field meetings were arranged to encourage participation, enhance recording skills, and maintain a sense of group purpose. Wherever possible, one or more hectads were allocated to an individual botanist or local group to co-ordinate. Over the years, around seventy volunteers have become involved in some way in the Atlas 2000 scheme in Derbyshire and in our own ongoing flora project.

Most records were collected on Monks Wood BRC plant cards (RP28), but a separate “Common Plant Card” was printed to encourage non-specialists to take part. This named the 350 most frequently recorded and easily identified county plants, listing them alphabetically by common name. Another card was produced just for high moorland plateau areas of low species diversity that few botanists visit. This allowed hillwalkers to collect data easily from up to five monads per card. In this way Rubus chamaemorus went from being a local Red Data Book plant to “locally frequent” almost overnight. Unlike the Common Plant Card, the High Moorland Card was photocopied in-house, but its paper quality proved insufficient to withstand the rough handling received, so records often arrived in a shoddy state. With hindsight, it might also have been helpful to have encouraged recorders to note down the route taken whilst recording, plus the time taken to complete each card.

Volunteers working from home transferred published records from past county floras and supplements onto spreadsheets. These were merged together, then collated for import into the Recorder database at Derby Museum. Other data sources included local Red Data Book records, key species records held by English Nature and Peak Park offices, plus some 15,000 records supplied electronically by local recorders. The latter proved useful, but could easily have contaminated our database with spurious data if not thoroughly checked both before and after import. Further data sets will be sought in the years ahead, including records from our local Wildlife Trust’s site surveys — which have recently been offered to us.

Validation
Upon receipt, the county plant recorder checks all record cards and marks off new 10 km square records on separate Mastercards for each of the 45 county hectads. As most data arrive on RP28 cards, Mastercards were themselves created on these cards, with different elements of a species name being marked off to indicate a different date class. Records and cards are then passed to Derby Museum for full computerisation on a Recorder 3.3 database, aided by a team of volunteers. To speed up data entry, screen pop-ups were created for all cards in regular use, and protocols developed to enable data collators to self-check the records they had just keyed in prior to processing and saving them. Error checking has always been regarded as a high priority at DBRC. Newly entered data are printed off in batches of 10,000 records onto continuous stationery on a wide-carriage dot matrix printer, then matched against each record card for accuracy. In this way, a small but significant number of basic recording and inputting errors were spotted very early on, as well as many minor typos. Major errors of grid reference or species name are corrected immediately; minor ones are corrected later on, and the printouts are then bound and archived. Record cards are photocopied and returned to original recorders if requested. Filing references on every paper and computer record enable us to trace back and validate any record. Plant cards are filed by hectad, then by Recorder’s own record numbers. (Tetrad letters added to each card give an alternative way of filing, should this ever be necessary.) Other documents and letters are given simple sequential filing numbers.
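Filing by hectad or tetrad relies on deriving both from the monad grid reference on each card. The sketch below assumes Ordnance Survey style references with two letters and four digits, and the conventional tetrad lettering (A to Z omitting O, running northwards up each column from the south-west corner); the article does not say which lettering the authors used, and the card data are hypothetical.

```python
TETRAD_LETTERS = "ABCDEFGHIJKLMNPQRSTUVWXYZ"  # 25 letters, O omitted


def hectad_and_tetrad(monad: str) -> tuple[str, str]:
    """Split a 1 km reference such as "SK2345" into hectad and tetrad."""
    letters, easting, northing = monad[:2], monad[2:4], monad[4:6]
    hectad = letters + easting[0] + northing[0]
    # Position of the 2 km square within the hectad, counted from the
    # south-west corner: column runs eastwards, row runs northwards.
    column, row = int(easting[1]) // 2, int(northing[1]) // 2
    return hectad, hectad + TETRAD_LETTERS[column * 5 + row]


# (monad, record number) pairs standing in for record cards.
cards = [("SK2345", 17), ("SK1260", 5), ("SK2040", 3)]

# File by hectad first, then by record number.
filed = sorted(cards, key=lambda c: (hectad_and_tetrad(c[0])[0], c[1]))
for monad, number in filed:
    print(number, monad, *hectad_and_tetrad(monad))
```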

Before data were finally submitted to the Atlas 2000 project at the end of 1999, species lists for each hectad were generated from Recorder and matched against those maintained manually by the county recorder. Again, every discrepancy was investigated and traced back to the original document or recorder for clarification or correction. The hardest discrepancies to resolve were species appearing on the paper checklists, but not on the database.
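The hectad-by-hectad comparison described above amounts to a set difference taken in each direction. A minimal sketch in Python, assuming both the database list and the manual list can be exported as simple (hectad, species) pairs; the function name and the sample data are hypothetical and are not part of Recorder:

```python
from collections import defaultdict


def hectad_discrepancies(database_pairs, checklist_pairs):
    """Compare two collections of (hectad, species) pairs.

    Returns a dict mapping each hectad with a mismatch to a pair of sorted
    lists: (species only in the database, species only on the paper checklist).
    """
    in_database = defaultdict(set)
    on_checklist = defaultdict(set)
    for hectad, species in database_pairs:
        in_database[hectad].add(species)
    for hectad, species in checklist_pairs:
        on_checklist[hectad].add(species)

    report = {}
    for hectad in sorted(set(in_database) | set(on_checklist)):
        only_db = sorted(in_database[hectad] - on_checklist[hectad])
        only_paper = sorted(on_checklist[hectad] - in_database[hectad])
        if only_db or only_paper:
            report[hectad] = (only_db, only_paper)
    return report


# Invented sample data, for illustration only.
database = [("SK26", "Bellis perennis"), ("SK26", "Primula veris"),
            ("SK27", "Bellis perennis")]
checklist = [("SK26", "Bellis perennis"), ("SK27", "Bellis perennis"),
             ("SK27", "Paris quadrifolia")]

for hectad, (only_db, only_paper) in hectad_discrepancies(database, checklist).items():
    print(hectad, "database only:", only_db, "| checklist only:", only_paper)
```

The second list in each pair corresponds to the hardest case noted above: species present on the paper checklist but missing from the database.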

Personalised maps and species lists offered to the most active botanists gave each the opportunity to see just the data they had contributed, plus a chance to query any missing or unusual data. Another useful error-checking routine involves regularly reporting the number of records of each species held on Recorder. The VC Recorder was encouraged to view and query all species for which fewer than 5 records were held, and a significant number of simple data inputting errors were traced by this means. All these tasks were labour-intensive, but give great confidence in our data. Despite this, we have already discovered a small number of errors which slipped through to the Atlas 2000 project in the few weeks since the New Atlas (Preston, Pearman & Dines 2002) appeared in print, and doubtless others will be found. All are being noted, Monks Wood BRC will be informed, and published mistakes will be referred to in the text of our own Flora. Obviously, all these error-checking processes have to be repeated at intervals to be continuously effective.
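The record-count report used to catch inputting errors can be illustrated with a short sketch. It assumes the data can be exported as one species name per record; the function name, the default threshold argument and the sample names are hypothetical, and this is not how Recorder itself produces the report:

```python
from collections import Counter


def rarely_recorded(species_per_record, threshold=5):
    """Return (species, count) pairs for species with fewer than `threshold` records."""
    counts = Counter(species_per_record)
    flagged = [(name, n) for name, n in counts.items() if n < threshold]
    # Fewest records first, then alphabetically, so likely typos surface at the top.
    return sorted(flagged, key=lambda item: (item[1], item[0]))


# Invented sample: one species name per record; "Belis perennis" is a keying error.
records = ["Bellis perennis"] * 12 + ["Belis perennis"] + ["Paris quadrifolia"] * 3
for name, n in rarely_recorded(records):
    print(f"{n:>3}  {name}")
```

A genuinely rare species and a mistyped common one both appear in such a list, which is why each flagged name still needs checking by someone who knows the county's flora.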

Feedback
Throughout the project, botanists have been kept informed of progress by regular newsletters, annual indoor meetings and, more recently, by website and email. Recorders were, and still are, encouraged to visit areas that are under-recorded. Maps plus lists showing monads and tetrads containing fewer than 80 or 120 taxa are produced annually, and are also available via a website. All feedback was based on statistics and maps generated from the Recorder 3.3 database. Whilst field recording is still being encouraged today, the emphasis has shifted in the last two years to a combination of filling in under-recorded squares and visiting sites where our group’s recording skills can be of value to others, e.g. National Trust and Wildlife Trust sites. After so much square-bashing, it is rewarding to visit botanically rich sites, perhaps to relocate Red Data Book species. We should probably have done this earlier on. The Derbyshire Biological Records Centre has now computerised half a million plant records at Derby Museum. Of these, over a third of a million were collected by local botanists and amateur naturalists specifically for the Derbyshire Flora 2000 project and for the New Atlas of the British and Irish Flora. Inevitably some data sets will have been missed, or simply not been accessible in the timescale available to us.
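The annual lists of under-recorded squares mentioned above come down to counting distinct taxa per square and comparing the total with a threshold. A minimal sketch, assuming the records are available as (square, taxon) pairs; the function name and sample data are hypothetical, and the threshold is left as a parameter because the text does not say which of the two figures applies to monads and which to tetrads:

```python
from collections import defaultdict


def under_recorded(square_taxon_pairs, threshold):
    """Return {square: distinct taxon count} for squares below `threshold` taxa."""
    taxa_in_square = defaultdict(set)
    for square, taxon in square_taxon_pairs:
        taxa_in_square[square].add(taxon)  # repeat records of a taxon count once
    return {square: len(taxa) for square, taxa in sorted(taxa_in_square.items())
            if len(taxa) < threshold}


# Invented sample with a tiny threshold, purely to show the behaviour.
pairs = [("SK3536", "Bellis perennis"), ("SK3536", "Bellis perennis"),
         ("SK3536", "Primula veris"), ("SK3537", "Bellis perennis")]
print(under_recorded(pairs, threshold=2))
```

Squares with no records at all never appear in the input, so a complete list would also need the full set of squares in the county to be checked against.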

County Plant Checklist
Once data had been submitted to the Atlas 2000 project, attention turned towards our prime goal of producing a new Flora of Derbyshire. We immediately identified a need for a definitive list of plants ever recorded in the county to guide us in our work. The production of that Checklist and the decision to publish it then occupied the next 18 months. Recorder 3.3 was used to generate a list of species showing the latest year each taxon had been recorded. Once imported into an Excel spreadsheet, it was then edited by hand to bring it to a publishable form, with a draft circulated to local botanists a year prior to final publication. The Checklist appeared in print in April 2002, as an alphabetic list of all plants ever recorded in the county, with the latest year of recording, county status and conservation status. Two tables provide extra information and involved considerable research. One shows all new or unpublished records made since the last Flora updates appeared in 1980, whilst the second lists all previously published species now considered erroneous or unconfirmed.

The Checklist serves four uses. It guides us in our work; it gives an interim product to local recorders who finally see their data in use; it encourages recorders to report errors or look for species not seen recently, and of course it generates a small amount of income. The printers worked directly from our master Excel and Word files. It cost £410 to produce 350 copies of the Checklist with a coloured card cover, folded to an A5 booklet totalling 50 sides. It sells for £2, and was fully funded by a grant from the Friends of Derby Museum who have allowed us to retain the income for future Flora-related activities. It lists 1,927 plant taxa of which 1,662 are distinct species, and of these 1,329 have been recorded since 1987.

The Way Forward
With the publication of the plant checklist, and the recent publication of the New Atlas of the British & Irish Flora, our efforts are now fully focused on identifying the best way of physically drafting the Flora itself. Initially, we considered writing flora entries on Recorder, an approach taken by the Flora of Fermanagh project. The practical difficulties of setting this up persuaded us against it, especially as the VC57 Plant Recorder (Dr Alan Willmot) is physically separated from the main data set maintained at Derby Museum. Instead, to help him draft the species accounts, we have decided to create two products for his use: 1) an “Expanded Checklist” and 2) a “Frozen Database”.

The “Expanded Checklist” is a fuller version of our published checklist incorporating many statistics generated by Recorder, and updated every six months or so. Figures such as first year, latest year, record counts, square counts of hectads, tetrads and monads etc. can all be readily generated from Recorder 3.3, then merged into one huge spreadsheet. Information from our published checklist was manually incorporated - a long task, only needing to be done once. All irrelevant taxa on Recorder, like “Dryopteris sp.”, were retained in the Expanded Checklist. This will allow us to update the statistical part of the spreadsheet without having to recreate the entire file.
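The statistics listed above (first year, latest year, record counts and square counts) can all be derived from three fields per record. The sketch below is a hedged illustration in Python, not the procedure used in Recorder 3.3: it assumes each record is a (taxon, year, grid reference) tuple with an all-figure Ordnance Survey reference such as SK353361, and it assumes tetrads follow the usual DINTY lettering. The function names `squares` and `expanded_stats` and the sample records are hypothetical.

```python
TETRAD_LETTERS = "ABCDEFGHIJKLMNPQRSTUVWXYZ"  # DINTY system: 25 letters, no "O"


def squares(grid_ref):
    """Return (hectad, tetrad, monad) for an OS grid reference such as 'SK353361'.

    Tetrad and monad are None when the reference is too coarse to give them.
    """
    ref = grid_ref.replace(" ", "").upper()
    letters, digits = ref[:2], ref[2:]
    if not letters.isalpha() or not digits.isdigit() or len(digits) % 2:
        raise ValueError(f"unrecognised grid reference: {grid_ref!r}")
    half = len(digits) // 2
    east, north = digits[:half], digits[half:]
    hectad = letters + east[0] + north[0]
    if half < 2:
        return hectad, None, None
    monad = letters + east[:2] + north[:2]
    # Tetrad letters run south to north within a column, columns west to east.
    index = (int(east[1]) // 2) * 5 + int(north[1]) // 2
    return hectad, hectad + TETRAD_LETTERS[index], monad


def expanded_stats(records):
    """Summarise (taxon, year, grid_ref) records into one row of statistics per taxon."""
    rows = {}
    for taxon, year, grid_ref in records:
        row = rows.setdefault(taxon, {"first_year": year, "latest_year": year,
                                      "records": 0, "hectads": set(),
                                      "tetrads": set(), "monads": set()})
        row["first_year"] = min(row["first_year"], year)
        row["latest_year"] = max(row["latest_year"], year)
        row["records"] += 1
        for key, square in zip(("hectads", "tetrads", "monads"), squares(grid_ref)):
            if square is not None:
                row[key].add(square)
    # Replace each set of squares with its size, ready for a spreadsheet row.
    return {taxon: {k: len(v) if isinstance(v, set) else v for k, v in row.items()}
            for taxon, row in rows.items()}


# Invented sample records, for illustration only.
sample = [("Paris quadrifolia", 1969, "SK17"),
          ("Paris quadrifolia", 1998, "SK172705"),
          ("Paris quadrifolia", 2001, "SK1871")]
print(expanded_stats(sample))
```

Holding the squares in sets means each square is counted once however many records fall in it, and a record localised only to a hectad contributes nothing to the tetrad or monad counts.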

We initially intended to present these data as a mail merge in Word, but finally decided to import the spreadsheet into an Access database, so enabling the VC Recorder to see all relevant statistics and facts in one place, and to draft, store and edit species accounts on screen. Further blank fields will be used to indicate, for example, whether a taxon should be mapped, the required map background, published references, links to relevant images and so on. It should be pointed out that during this process a bug was discovered in the way R3.3 counts squares. A slight over-counting was found when comparing statistics with plotted map data, though it is only significant for the rarest species. A fix prepared by Stuart Ball will be distributed with the next Recorder 2002 update, and fuller details will soon be available on the NFBR website and Recorder 3 User eGroup.

The “Frozen Database” is essentially a duplicate copy of the main plant data set fixed at one point in time (planned for December 2002). As most key data sets have now been captured, the “frozen database” will not be updated until we near the end of the project, except for major errors that might otherwise affect the accuracy of the species accounts being prepared. These accounts can then be written with full access to all key records. Not knowing if this was the best way forward, and surprised by the lack of ready information or advice on how other Flora Groups operate, the author decided to establish an email group for those involved in Flora projects in the UK. This eGroup is advertised elsewhere in this Newsletter.

Having prepared a range of sample flora entries (based on local Red Data Book plants) we shall then seek feedback from local botanists before continuing. These accounts will then be incorporated into an update of our project proposal for the Flora (first drafted in 1997) and used when seeking funding. But even drafting 100 entries a month would alone take 2 years, and during that time we shall have to identify and cajole chapter authors, identify photographic requirements, determine the best way of presenting our data in DMAP, find adequate funding for our product, and find a publisher. (A recent BSBI conference on Local Floras advised against seeking major funding more than two years in advance of publication.) With all these challenges still ahead, it sometimes seems unwise to set our sights too high on producing a full colour publication of the quality and appearance of the recent Flora of Norfolk (Beckett, Bull & Stevenson 1999), but it would equally be a shame to aim too low and not do our county justice. Realistically, we shall need to raise well over £25,000. But with just £500 in the bank so far and 500,000 plant records already on computer, the challenges ahead seem as great as the ones we have already surmounted.

Copies of recording cards created for the Derbyshire Flora Project, plus a sample of the Expanded Checklist and notes on its production can be downloaded from the Files section of the new UK Flora Writers eGroup: www.smartgroups.com/groups/Florawriters

Copies of A Checklist of the Plants of Derbyshire can be obtained by sending a cheque for £2.50 (including p&p), payable to the “Derbyshire Flora Committee”, to the address below.

Nick Moyes
Derbyshire Biological Records Centre, Derby Museum & Art Gallery, The Strand, Derby DE1 1BS.

Flora of Assynt

Flowering Plants and Ferns by P.A. Evans and I.M. Evans
Bryophytes by G.P. Rothero
Published by P.A. & I.M. Evans 2002 284pp ISBN 0-9541813-0-1

Origin and timing
The decision to compile a tetrad Flora of Assynt was taken in 1988 by Pat and Ian Evans and record gathering continued for 13 years. In 1992 G.P. Rothero joined the team; he was wholly responsible for the authorship of the Bryophyte section and made a valuable contribution to the records of montane flowering plants. Prior to this, the only flora covering the parish was John Anthony’s Flora of Sutherland, published posthumously in 1976. In this, common species were listed as present/absent for any given parish (there are just 13 in Sutherland) and, if the species was scarce, the names of the localities in which it occurred were given. The Botanical Society of the British Isles has information on all vice-counties and the vice-county recorder maintains these files. P.A. Evans is the recorder for West Sutherland and holds the data for the area.

The use of the tetrad as a recording unit (the first time this had been done in the Highlands) was, in the event, fully justified by the number of records of uncommon plants made in hitherto unworked areas. Not many of the West Sutherland records on file were sufficiently well-localised to assign to a tetrad in Assynt and most of these had been made by botanists on holiday in the ‘honeypot’ places. Assynt had no resident botanists until Pat and Ian moved there and almost all the data for the latest survey were gathered by them, although a number of people, both local and visiting, supplied some records. A more substantial contribution was made by a group of BSBI members during a field meeting held in Assynt early on in the project to assist with recording.

Data gathering
A common species recording card was devised to suit the local flora and the selection of the species for this list was made using John Anthony’s Flora of Sutherland. It was divided into 4 sections: ferns and fern allies; grasses; sedges and allies; the rest. During a visit to a tetrad these common species were crossed off and given a frequency and habitat. All occurrences of rarer species were listed on the back of the form, each with a six-figure grid reference. The shortness of the season so far north meant that recording was only feasible between late May and the end of September and during this period about three days a week were spent in the field. There are few roads in Assynt and not many footpaths; covering the ground can be very time-consuming and really distant areas had to be surveyed in ‘prime time’ so that orchids and grasses could be identified on a first, or what might have to be an only, visit. Of the 164 tetrads, a few far-flung marginals were only visited once, most at least twice and some considerably more often. In all, 30,676 records of flowering plants and ferns were made, and, in the much shorter time available to G.P.R., 13,600 records of bryophytes.

Sins of omission
Most of these could have been avoided had the project been thought through in more detail from the very beginning. As it was, ‘learning from experience’ played an important part. The survey would have benefited from a study of the details of Assynt’s somewhat complex geology before each field trip. The importance of basic dykes in the gneiss only became apparent as time went on. The inconsistency in the recording of subspecies was regretted when the species accounts were being written and had the historic records of uncommon species been researched earlier, it might have been possible to track more of them down. In other words, my advice is to ‘Look hard before you leap!’

Data management
Each winter the records were copied from the field recording sheets to Master cards, which were never taken into the field. They were also entered into Recorder 3.3, using ‘pop-ups’ made to mirror the list of common species on the recording sheets. At the end of September 2000 it was decided not to take in any more records except in very exceptional cases (a new vice-county record at least!). Too much time would otherwise have been spent endlessly processing odd items and updating the relevant maps. Local floras are, after all, only a snapshot in time. ‘New’ records were nevertheless still collected, to be processed after the book had been published.

When all records had been entered into Recorder they were transferred to DMAP 7.0e via PLOT 5 and R2DW. The map of Assynt used in DMAP had hectad lines and roads indicated and the sea blocked in grey. The maps used for bryophyte distribution had, in addition, an ‘X’ covering the tetrads that G.P.R. had not surveyed. As he did not join the team until partway through the project, it was inevitable that the coverage of mosses and liverworts was not complete; nevertheless, more than 100 tetrads were visited. Geology was not built into the distribution maps; it was impracticable at that scale and using colour would have added greatly to the cost. A detailed coloured map of the geology, together with a simplified black and white version, was included in the relevant chapter. The width of the distribution maps for the species account had been determined as 40mm and this was entered into DMAP via File - Page Setup. Each map was Saved As '*.WMF', where * is a recognisable contraction of the scientific name. These files are stored by DMAP in a default folder ‘Output’ from which they were transferred, a batch at a time, to the Clipart gallery.

Layout
It had been decided that the book was to be a laminated softback, A4 format, with a wrap-around landscape photograph on the cover. Word 2000 was used in its production. Colour plates of individual species were not included, as well-illustrated books of these are easily obtainable. Coloured landscape photographs of a range of habitats in Assynt were, on the other hand, thought to be an asset and a block of 25 of these was included at two, and in one case three, to a page. They were supplied to the printer as transparencies.

After trying out various page layouts, it was decided that the interests of clarity and economy would both be served by having the introductory chapters set in two columns per page and the species accounts in three. The chapters, covering geology, climate, history of the landscape and vegetation, were simple enough to handle; in most cases each one was made as an individual file. The species accounts were more difficult. Having decided on a typographic pattern which distinguished between the scientific names, synonyms, English names and Gaelic names, the heading for each species was entered, ranged left. The text was then typed in, initially right across the page to make for easier reading and editing. Each account contained the first and other significant historic records from all available sources back to 1767. The contemporary account was written with the aid of the distribution map and from the authors’ personal experience of the plant in this area, which was possible as they had carried out most of the fieldwork. When a file of manageable size (in practice between 40 and 50 KB at this stage) had been composed and edited, it was converted into 3 columns. It was decided not to use justified text, as with such a short line some very strange effects could be created.

Starting with the first species in the first column, a line space was opened up after the names and the relevant map was selected using Word’s Insert - Picture option. This led into the Clipart gallery (previously stocked from DMAP, see under Data management), where the map was highlighted, moved onto the text page and inserted into the space opened up after the names of the plant. Working in this way down the columns required much adjustment of spacing to make sure that a map was always embedded in the relevant text. The files at this stage were between 800 and 2000KB, just about manageable, but Word was sometimes a little slow in moving around when all the maps were in place. Once the columns were fixed, even the smallest alteration in the text could have an effect on the spacings that might extend over more than a page. It was for this reason that it was decided to supply the printer with camera-ready copy. With the ever-present possibility of losing data in mind, copies of the Flora files were always lodged in at least one other house.

Costs
A print-run of 750 copies was decided upon, for which we paid the print management firm £9255. They were given camera-ready copy for the whole book, apart from the landscape illustrations that were supplied as transparencies and some of the topographical and geological maps that were scanned or sent on disc. A further £1000 was taken up by accounts for cartographic services, computer advice and assistance, a commissioned frontispiece, fliers, purchase of a (second-hand) laser printer and ISBN registration. The decision to publish the book privately was made to free ourselves from the constraints (and charges) which are inevitable when using professional publishers. The decision has, of course, its downside in the shape of a new challenge. Pre-publication offers, publicity, review copies, booksellers, invoices, bank accounts and endless wrapping up of parcels took up a great deal of time, but the work was quite satisfying in a non-botanical way and the decision has not been regretted. The book was marketed through local tourist information centres and booksellers in the north of Scotland who might be expected to sell to visitors. Pre-publication fliers went to members of the B.S.B.I. and some other societies, and reviews included one by a colleague who writes for a national daily, which generated a number of requests. The B.S.B.I. review will not be out until early in 2003.

We decided on £15 as the retail price, with a discounted price of £10 to booksellers and a pre-publication offer of £12.50. Approximately 120 were sold pre-publication and as of now (the end of September) we have disposed of, in round figures, 450 books. These include complimentary and review copies. When the survey was completed and specimen pages of maps and text were prepared, applications were made for financial assistance to three organisations. Scottish Natural Heritage, who support local research, contributed £5000; the Botanical Society of the British Isles (the authors are active members) £1000 and the Glasgow Natural History Society, who encourage research and publications relating to the west of Scotland, £1000. The last-named made an outright grant and the other two do not expect repayment until the authors have covered their own investment.

We have now covered our investment, paid back the Botanical Society of the British Isles (as a registered charity, they are repaid first) and hope soon to be able to start refunding the substantial amount loaned by Scottish Natural Heritage. The book is available from mail order and other booksellers and can be ordered quoting the ISBN given at the beginning of this article. It may also be purchased from the publishers P.A. & I.M. Evans, Calltuinn, Nedd, Drumbeg, Sutherland IV27 4NN for £15 plus £4.50 postage and packing. Cheques payable to ‘Flora of Assynt’.

Pat Evans

UK Flora Writers egroup

Anyone involved in producing a regional flora, checklist, or interactive CD should be interested to learn of a new email group recently established. The UK Flora Writers egroup was set up following a recent conference on Local Floras organised by the Botanical Society of the British Isles (BSBI). It aims to encourage communication and exchange of ideas between people who might otherwise be working in isolation. It came as a surprise to some that the BSBI did not actively co-ordinate the efforts of county flora groups.

The eGroup aims to be a self-help group, whereby an email sent by a member is automatically forwarded to all other group members, as are any replies. It is hoped that individuals who have recently produced their own county floras will consider joining. They will undoubtedly have much skill and experience to offer the rest of us. In addition, the group’s homepage has a growing set of resources such as links to online herbaria, atlases and floras, plus downloadable files with contributions from Arthur Chater, Geoffrey Halliday and Martin Sanford, amongst others. Joining is easy, free, and open to anyone with an interest in local floras. Go to www.smartgroups.com/groups/Florawriters and follow the simple on-screen instructions.

Alternatively send a blank email to Florawriters-subscribe@smartgroups.com. Note that joining by email will not enable you to access files or links on the homepage until you have also registered with Smartgroups – a simple and free process.

Recorder 3 Users egroup

A self-help support group for users of Recorder 3.x software was launched in November by the NFBR. Though now superseded by Recorder 2000 and 2002, this program is still used by a large number of individuals and biological records centres across the UK. Aware that formal support for Recorder 3.x is no longer available, NFBR is keen to help users help each other by sharing experiences, problems and solutions in using this package. The NFBR cannot itself give advice through these discussion groups, though many of its members will undoubtedly be willing to help others. Users should always ensure they have adequate copies of their data at all times. Further information on Recorder 3 and a number of FAQs written by Stuart Ball and others will eventually be available both on this egroup’s homepage and from our own NFBR website. Joining the Recorder 3 Users eGroup is easy. It is currently free and open to anyone involved in using this software. It is not exclusive to NFBR members.

Go to www.smartgroups.com/groups/RECORDER3 and follow the simple on-screen instructions. Alternatively send a blank email to Recorder3-subscribe@smartgroups.com. Note that joining by email will not enable you to access files or links on the homepage until you have also registered with Smartgroups – a simple and free process. As with most other eGroups, off-topic messages may be removed and spammers barred.

Nick Moyes

A New Atlas of the British & Irish Flora

Preston, C.D., Pearman, D.A. and Dines, T.D.

Oxford University Press 2002, 912pp, 2412 colour distribution maps and CD-ROM

Based on over 9 million records, and including the results of the BSBI Monitoring Scheme and the recent Atlas 2000 survey, the long-awaited Atlas was published in September 2002. All native species are covered, together with a wide range of subspecies. Pre-1987 and pre-1970 records are shown, so that the changes in range and frequency of many species are demonstrated. The text describes the habitat of each plant; summarises changes in its distribution; gives the date of introduction of alien species; outlines its European and wider distribution; and provides key references for further reading.

The records and text for the mapped species, plus over 940 additional rare aliens, are summarised on an accompanying CD. This enables users to view and print distribution maps, captions and associated data tables and manipulate the data to produce species lists, additional maps and add overlays containing environmental information.

The RRP for this huge volume with CD-ROM is £99.95, but it will be permanently available at the reduced price of £70 plus postage from BSBI Publications, Summerfield Books.

Tel 017683 41577 www.summerfieldbooks.com

Vice-County Digitising Project

Over the last year or so, the National Biodiversity Network (NBN) has been running a pilot exercise to investigate the costs and practicalities of producing a fully digitised version of the Watsonian Vice-County maps. Anyone working with VC recorders or their data will appreciate both the value of unchanging boundaries and the practical difficulties of determining exactly where the real boundary should be when working in the field. The two available maps covering the UK and published by the Ray Society in 1969 give only a rudimentary outline of each county, and are useless for detailed work. It is often necessary to go to a local library or archive office and view maps of the period and then transfer them to modern OS maps in order to get grid co-ordinates.

The pilot is being managed by Charlie Copp, whilst digitisation is being done through Landmark, with the work itself undertaken in India. The initial results of this pilot have only just come in, and we are told that further time is needed to make a full appraisal. The project will then move on to a costing exercise which may show that the whole thing is too expensive, that the original specification needs to be revised, or that full digitisation should go ahead. In the last case, funding will then have to be found. There must be many naturalists reading this who would welcome seeing and using the results of this project, and possibly the odd Local Records Centre, too! So watch this space.

Nick Moyes (VC57).