BARCODE OF LIFE DATA SYSTEMS
BOLDSYSTEMS.org
Handbook
October 2008
Table of Contents
1. Introduction
2. BOLD General System Map
3. Signing up for BOLD
4. Taxonomy Browser
5. BOLD Search
6. Create a BOLD Project
7. Submission Protocols
a) Data Submission
b) Image Submission
c) Trace Submission
d) Sequence Submission
8. BOLD Project Summary
1. Introduction
The Barcode of Life Data System (BOLD) is an informatics workbench aiding the acquisition, storage, analysis and
publication of DNA barcode records. By assembling molecular, morphological and distributional data, it bridges a
traditional bioinformatics chasm. BOLD is freely available to any researcher with interests in DNA barcoding. By
providing specialized services, it aids the assembly of records that meet the standards needed to gain BARCODE
designation in the global sequence databases. Because of its web-based delivery and flexible data security model,
it is also well positioned to support projects that involve broad research alliances.
This handbook provides details on how to sign up for BOLD and create a project. It also explains how to upload
specimen data, images, traces and sequences to your project on BOLD.
Figure 1-1: The front page of BOLD.
2. BOLD General System Map
[Figure: BOLD general system map, starting from www.barcodinglife.org. The legend distinguishes manual input, documents, actions, viewable data and downloadable data.]
Labels on the public side of the map: Request an account; BOLD Tutorial; Tax Browser; Browse Hierarchy; Species Level; Species Barcoded Report; Download Sequences; BOLD Taxon Search; ID Specimen; COI Database Identification; ITS Database Identification; Documentation; Templates; Manuals; Published Projects*; Private Projects* (log-in).
Project Management from Project Console or Record Listing Page: Sequence Analysis; Taxon ID Tree; Nearest Neighbour; Distance Summary; Sequence Composition; Specimen Aggregates; Distribution Map; Spec Age vs Seq Length; Image Library; Downloads; Data Spreadsheets; Sequences; Traces; Specimen Labels.
Functions marked in the map as only available from the private project console: Uploads; Specimen Data; Images; Traces; Sequences; Primers; View All Primers; Register Primers; Publication; Submit to Genbank; Project Summary; Create New Project; Search All Records.
*The published projects are also accessible when a user is signed in to the private projects workspace.
3. Signing up for BOLD
Getting an account on BOLD allows you to upload your
data into a private workspace and take advantage of the
integrated analytical tools.
On the BOLD main page (www.boldsystems.org) click
on either one of the two links: ‘Request an Account’ or
‘Request a new user account’. These links will take you to
the New User Application Form.
(http://www.boldsystems.org/views/newuserapp.php)
Click on ‘Submit Request’ to send your application to
BOLD. An introductory e-mail will be sent to you with the
information you need to log in and begin using BOLD.
Once you have an account, you can log in via the main page
to access your private workspace. Your next step will be to
create a project; see Section 6 for instructions.
Valid Email Address: Use a current institutional email.
First Name: Fill in your first name; the first letter should be capitalized.
Middle Initial: Fill in middle initial(s) if needed, capitalized.
Last Name: Fill in your last name; the first letter should be capitalized.
Institutional Affiliation: Select the name of your institution.
Add New Institution: If your institution is not listed, click on this button to register it.
Password: Should be at least 5 characters.
Table 3-1: Information required to create a new user account on BOLD.
Figure 3-1: New user account creation on BOLD.
4. BOLD Taxonomy Browser
The taxonomy browser allows
users to examine the progress
of DNA barcoding, and to
browse different levels of the
taxonomic hierarchy. Animals,
Plants, Fungi, and Protists are
being barcoded and the user
can browse through each
kingdom from phylum down
to the species level.
Figure 4-1: The BOLD taxonomy browser.
Lineage: Lists the higher taxonomic levels.
Lower Taxonomy: Links to all lower classifications.
Graphic Displays of:
» the total number of barcodes and reference barcodes.
» quantity of species barcodes and those used as reference barcodes.
» the institutions where the specimens are deposited.
» a map of the world highlighting specimen collection locations.
» a graph showing the frequency of specimens/barcodes against age.
» a list of countries where specimens were collected, including the number of specimens from each country.
» various images of specimens within that taxonomic group.
Specimen Records: The number of specimen records.
Specimens with Barcodes: The number of barcoded specimens.
Public Sequences: The number of public sequences and a link to download them.
List of Species Barcoded: A list of all species with records on BOLD. The number of specimens, the number of sequences and the number of sequences greater than 500 bp are listed.
Link Outs: Links to several community partners' pages for that specimen.
Table 4-1: Information available at each taxonomic level within the BOLD taxonomy browser.
5. BOLD Search
On the BOLD project list page, select the ‘Search All Records’
link on the top left-hand side. There are two types of searches on
BOLD: Basic Search and Advanced Search.
Taxonomy: Searches the taxonomic names on BOLD.
Geography: Country/FAO and State/Province. Searches the country and province names on BOLD.
Table 5-1: Explanation of the terms used within the Basic BOLD search functions.

Taxonomy: Searches the taxonomic names on BOLD. There is a text field for search terms that should be either included or excluded from the search.
Geography Country/Province: Searches the country and province names on BOLD. There is a text field for search terms that should be either included or excluded from the search.
Geography Region: Searches region names on BOLD. There is a text field for search terms that should be included in the search.
Sequence Length: Text fields for the minimum and maximum number of base pairs.
Specimen/Sample ID: Searches sample IDs and process IDs on BOLD. There is also the option of pasting a list of sample or process IDs from a spreadsheet (link to the right).
Include GenBank Data: When checked, the search includes GenBank records on BOLD.
Single Representative per Species: When checked, the search will only display one representative per species found.
Table 5-2: Explanation of the terms used within the Advanced BOLD search functions.

Note that any search criteria containing spaces, such as species
names, country names that consist of more than one word, and
sample IDs with spaces, should be wrapped in double quotes (e.g.
“United States” or “Drosophila melanogaster”). The Paste from
Spreadsheet function allows you to paste a column of sample IDs
or process IDs from a spreadsheet and will automatically place
quotes around search criteria that require them.
Figure 5-1: The BOLD search engine, showing both basic and advanced search functions.
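The quoting rule for search criteria can be illustrated with a small Python sketch. This is not BOLD's own code: the function name quote_search_terms is hypothetical, and joining the terms with a single space is an assumption about how several criteria would be typed into one text field.

```python
def quote_search_terms(terms):
    """Wrap every term that contains whitespace in double quotes.

    Mirrors the rule in the handbook: multi-word species names, country
    names and sample IDs must be quoted. Terms that are already quoted
    are left alone, and blank cells are skipped.
    """
    quoted = []
    for term in terms:
        term = term.strip()
        if not term:
            continue  # ignore empty spreadsheet cells
        already_quoted = len(term) >= 2 and term[0] == '"' and term[-1] == '"'
        if any(ch.isspace() for ch in term) and not already_quoted:
            term = f'"{term}"'
        quoted.append(term)
    # Assumption: criteria are separated by a single space in the text field.
    return " ".join(quoted)
```

For example, quote_search_terms(["United States", "Canada"]) returns the string "United States" Canada, with quotes only around the two-word name.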
6. Creating a new project
Once logged into BOLD, select the ‘Create New Project’ link on the top left hand side of the project list
page. It will take you to the New Project Submission Form. The following pieces of information need to be
entered in order to create the project:
Project Title: Please create a descriptive name.
Project Code: A 3-5 letter code. It needs to be unique across BOLD.
Project Type: Choose between the following options:
• Data Project (contains specimen & sequence records)
• Folder Project (contains other projects)
Primary Marker: Select your primary marker. COI is the default.
• Cytochrome Oxidase Subunit 1 5’ Region
• Internal Transcribed Spacer (ITS) Region
Campaign: Select the name of the campaign the project is part of, or ‘None (General Project)’ if it is not part of a campaign.
Place in container: Select the name of the Folder Project, or ‘Independent Project’ if it does not belong to a folder project.
Project Description: A brief summary of the use and intention of the project.
Project Manager: The person who creates a project is automatically the project manager, and has full specimen and sequence access.
Assign Users: Other BOLD users can be added to a project. Different levels of access are possible:
• Sequence Access: Analyze, View, and Edit Sequences
• Specimen Access: Edit Specimens
Table 6-1: Required information for BOLD project creation.
Figure 6-1: The BOLD new project submission form.
Sequence Access permissions consist of three levels. With Analyze permission, the user can perform analysis on the
data, but cannot view more than a summary of the data (sequence and related information remain hidden). With View
permission, the user can view or download the sequence data. With Edit permission, the user can upload sequences
or make changes to existing sequence features.
Specimen Access permission allows the user control over sample identifiers, taxonomy, collection data, and images of
the specimen: this level is intended for project managers, collectors, and taxonomists only.
To submit your entries to BOLD, click ‘Save’ at the bottom of the form.
Please note that the person who creates a project is automatically the project manager of that project. The
project manager has full access to the project and can assign other users to the project.
The project manager can change any details or add/remove users, by simply clicking on ‘Modify Project Properties’ in the upper left corner of the project.
7a) Data Submission Protocol
This protocol assists in the submission of bulk data to BOLD. This is the easiest way to populate your project with records,
as well as the only way to enter new species taxonomy into the BOLD library. Described below is the format of
the data that is required for a correct submission.
Whenever a bulk submission is sent to the data manager (boldsyst@uoguelph.ca), the following pieces of information need
to be sent in the body of the email, with the standard submission spreadsheet attached:
I. Project title
II. Project code
III. Project manager
IV. Priority level (High, Intermediate or Low)
V. Submission type (New Records or Update)*
* If the type is ‘Update’, please specify which worksheets (Voucher Info, Taxonomy, Specimen Details, or Collection Data) need to be updated. See ‘Data Submission - Types’ for more information.
The data spreadsheet consists of four worksheets: a main specimen identifier worksheet (Voucher Info) that is linked to three
other worksheets (Taxonomy, Specimen Details, and Collection Data). Refer to Tables 7a-1 through 7a-4 for field definitions.
Sample ID *: ID associated with the sample being sequenced (often an extension of field or Museum ID).
Field ID *: Specimen identifier from a private collection or Field number from a collection event.
Museum ID *: Catalog number in curated collection for a vouchered specimen.
Collection Code: Code associated with given collection.
Institution Storing *: Full name of the institution where specimen is vouchered.
Sample Donor: Full name of individual responsible for providing specimen or tissue sample.
Donor E-mail: E-mail of the sample donor.
Table 7a-1: Field definitions for Voucher Info page on accompanying spreadsheet.

Full Taxonomy: Full taxonomy consisting of phylum*, class, order, family, subfamily (optional), genus, species binomial.
Identifier: Full name of primary individual responsible for providing taxonomic identification of the specimen.
Identifier E-mail: E-mail address of the primary identifier.
Identifier Institution: Institution of the identifier.
Table 7a-2: Field definitions for Taxonomy page on accompanying spreadsheet.

Sex: Male/female/hermaphrodite only.
Reproduction: Sexual/asexual/cyclic parthenogen only.
Life Stage: Adult/immature only.
Extra Info: User Specified Characteristics (free text). Can be displayed on a tree or used to sort records. Limited to a maximum of 50 characters. Designate FAO region here.
Notes: Free text or XML tagged text. All XML text should be surrounded by the XML start (<xml>) and stop (</xml>) tags.
Table 7a-3: Field definitions for Specimen Details page on accompanying spreadsheet.

Collectors: Comma delimited list of collectors.
Collection Date: Date of collection, must be in MM-DD-YYYY format.
Continent/Ocean: ISO Continents and Oceans.
Country: ISO Countries.
State/Province: States and provinces (according to Getty Geographical Thesaurus).
Region: Park, county, district, lake or river.
Sector: Sector of park or county/city.
Exact Site: Description of collection location.
GPS Coordinates: Latitude & Longitude in “degrees.decimal degrees” format (e.g. 45.837).
Elevation/Depth: Elevation or depth in meters.
Table 7a-4: Field definitions for Collection Data page on accompanying spreadsheet.

* Minimum required fields for new records.
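Before sending a spreadsheet, the Collection Data formats can be checked locally. The following Python sketch is only an illustration: the function names are hypothetical, the date format is assumed to be MM-DD-YYYY with a four-digit year, and the latitude and longitude ranges are the usual geographic limits rather than anything stated by BOLD.

```python
from datetime import datetime


def is_valid_collection_date(text):
    """Return True if text is a real calendar date written as MM-DD-YYYY."""
    try:
        # %Y requires a four-digit year, so "27-Jul-07" is rejected.
        datetime.strptime(text, "%m-%d-%Y")
    except ValueError:
        return False
    return True


def is_valid_coordinate(latitude, longitude):
    """Return True if both values parse as decimal degrees within range."""
    try:
        lat, lon = float(latitude), float(longitude)
    except (TypeError, ValueError):
        return False  # e.g. degrees/minutes/seconds notation
    # Assumption: standard geographic bounds, not a documented BOLD rule.
    return -90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0
```

A date such as 07-27-2007 passes, while 27-Jul-07 does not; a coordinate pair such as 10.772 and -85.434 passes.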
Data Submission - Examples
Here is an example of a properly filled in data submission. You can get this blank template in two ways:
• From the info CD that came along with your sampling units from the CCDB
• Online by clicking on “Specimen Data” under the Uploads menu – the sheet is available through the link at the
top
Use the tabs at the bottom of the workbook to navigate through the four pages.
All of the data in BOLD is organized by projects. There is a limit of 1000 entries for a given project, to keep the size
manageable. Related projects can be grouped into containers. An individual entry in the database represents a barcode
of a given specimen. The Process ID uniquely represents a specimen in BOLD. This is the identifier that is used to track
a specimen through the barcoding process: collection, taxonomic identification, sequencing, analysis and final publication
of data. Process ID is assigned internally when a specimen record is created.
Specimen data can be entered in one of two ways. As outlined here, for larger sets of samples, the data can be entered
on the Data Submission Template spreadsheet and sent to BOLD. Data managers will review the data to ensure that it
meets the minimum requirements, and then enter it into BOLD. For smaller numbers of entries (i.e. 1-10 records), users can
enter sample data through the web interface by clicking on “Specimen Data” under the Uploads menu and using the
manual interface there.
Specimen Info
Sample ID | Field ID | Museum voucher ID | Collection Code | Institution Storing | Sample Donor | Donor Email
Sample-demo01 | Sample-demo01 | | | BIO | Joe Smith | jsmith@BIO.org
Sample-demo02 | Sample-demo02 | 15466-JUC-ISC | ISC | ROM | Joe Smith | jsmith@BIO.org
Sample-demo03 | Sample-demo03 | | | BIO | Joe Smith | jsmith@BIO.org
Figure 7a-1: Example data for Specimen Info
Taxonomy
Sample ID | Phylum | Class | Order | Family | Subfamily | Genus | Species | Identifier | Identifier Email | Identifier Institution
Sample-demo01 | Arthropoda | Insecta | Diptera | Asilidae | Hydropsychinae | Efferia | Efferia aestuans | Joe Smith | jsmith@BIO.org | Oxford
Sample-demo02 | Arthropoda | Insecta | Diptera | Asilidae | | Asilus | | Joe Smith | jsmith@BIO.org | Oxford
Sample-demo03 | Arthropoda | Insecta | Diptera | | | | | Joe Smith | jsmith@BIO.org | Oxford
Figure 7a-2: Example data for Taxonomy
Specimen Details
Sample ID | Sex | Reproduction | Life Stage
Sample-demo01 | Female | Sexual | Adult
Sample-demo02 | Male | Sexual | Adult
Sample-demo03 | Male | Sexual | Adult
Extra Info / Notes entries in the example: “Commonly called ‘Robber Fly’”; “feeding on fruit”.
Figure 7a-3: Example data for Specimen Details
Collection Info
Sample ID | Collectors | Collection Date | Continent/Ocean | Country | State/Province | Region | Sector | Exact Site | Latitude | Longitude | Elevation
Sample-demo01 | Joe Smith | 27-Jul-07 | Asia | Japan | Hokkaido | Izarigawa, Eniwa | | | 42.878 | 141.572 | 45
Sample-demo02 | Joe Smith | 27-Jul-07 | | Japan | Hokkaido | Soya | | | 44.671 | 142.788 |
Sample-demo03 | Joe Smith | 5-Sept-07 | Central America | Costa Rica | Guanacaste | ACG | Mundo Neuvo | | 10.772 | -85.434 | 305
Figure 7a-4: Example data for Collection Info
Data Submission - Types
There are two types of submissions: “New Submission” and “Update”.
A new submission is made every time new records are added to a project. Update submissions are for modifying records that
already exist in a project. If you wish to update only one or two records, please select the specimen manually from the species record
listing in your project and click on the “edit” button in the upper right corner. Any details can be edited in this way, except for adding
new taxonomy to BOLD. If there is new taxonomy to add to the BOLD library, it should be sent in as an update.

New Submission
New submissions are project specific, so that their data can be associated with a project on BOLD. If records are submitted that need to be entered into different projects on BOLD, a separate file for each project needs to be sent.
The minimal requirements for a new submission on BOLD are:
• Voucher Info Page - Sample ID
• Voucher Info Page - Field ID and/or Museum voucher ID
• Voucher Info Page - Institution Storing
• Taxonomy Page - Phylum
Other useful information:
It is important to use a unique and original format for the sample IDs. If the sample IDs provided are not original to BOLD, they will need to be changed before the data can go online.
Provide as much detail and additional information as possible with a new submission. That way, it will take less time later to fill in the blanks.
Only the following characters may be used in the sample ID, field ID, and museum ID: numbers, letters, and ^ . : - _ ( ) #
All other characters will be removed.
If the specimen has sex, reproduction or life stage values that do not fit the accepted values for Specimen Details, then please move the information to the Extra Info or Notes fields.
If the submission is part of a campaign, iBOL Working Group, or a checklist, please let us know in the submission email.
Please see ‘Data Submission - Examples’ for an example of the filled-in spreadsheet.

Update Submission
The quickest way to update data is to download the Data Spreadsheet from BOLD containing the records that need to be modified. To do so, click on “Data Spreadsheets” from the Downloads menu on the upper left side of your project. Only download the worksheets and records that will be affected by the update (e.g. if the taxonomy needs to be updated, only download the Taxonomy worksheet; if specimen details and collection data need to be updated, only download the Specimen Details and Collection Data worksheets, etc.). Once the worksheets are downloaded, modify the data and copy it into the standard submission spreadsheet. The submitted update should reflect what the data should be on BOLD. Please send this on to the data manager.
NOTE: Any fields left empty will be considered blank and thus removed from BOLD during an update. Do not remove any data from the update sheet if you would like it to stay on BOLD. The computer cannot distinguish between “blank: do not update this field” and “blank: delete the content of this field”.
Updates to Voucher Info are slightly different from updates to Taxonomy, Specimen Details, and Collection Data.
a.) Updates to Voucher Info
As with new submissions, updates to the voucher info are project specific. The records need to be split into their corresponding projects.
In the case where the donor or identifier is deceased or retired, please make a note of that in the email field. It is important to provide this information so that we can keep the database up to date.
b.) Updates to Taxonomy, Specimen Details, and Collection Data
Updates to taxonomy, specimen details, and collection data are project independent. Records from any number of projects can be submitted in one submission spreadsheet, and the number of records is (in theory) unlimited for this type of update.
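The identifier character rule and the minimal requirements for a new submission can be sketched in Python as a local pre-check. This is an illustration under stated assumptions, not BOLD's validation code: the function names and dictionary keys are hypothetical, "letters" is taken to mean ASCII letters, and spaces are removed because the space character is not in the stated list.

```python
import re

# Everything except numbers, ASCII letters and ^ . : - _ ( ) # is disallowed.
_DISALLOWED = re.compile(r"[^A-Za-z0-9^.:\-_()#]")


def sanitize_identifier(value):
    """Drop characters that are not permitted in sample, field or museum IDs."""
    return _DISALLOWED.sub("", value)


def missing_required_fields(record):
    """List the minimum required fields that are empty in one record.

    `record` is a dict keyed by column name (hypothetical keys that follow
    the worksheet headings used in this handbook).
    """
    missing = []
    if not record.get("Sample ID"):
        missing.append("Sample ID")
    if not (record.get("Field ID") or record.get("Museum ID")):
        missing.append("Field ID and/or Museum ID")
    if not record.get("Institution Storing"):
        missing.append("Institution Storing")
    if not record.get("Phylum"):
        missing.append("Phylum")
    return missing
```

Running missing_required_fields on each row before emailing the spreadsheet gives a quick list of gaps to fill; an empty list means the row meets the stated minimum.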
7b) Image Submission Protocol
This protocol outlines the image submission process on BOLD.
It describes the necessary format of the images and the ancillary
data, and the steps required to build the uploadable package required for a successful submission.
The image submission package for BOLD is a zip file containing a
set of images and an Excel spreadsheet that associates the necessary data with each image. There must be a row in the spreadsheet for each image uploaded, and the required columns must
be filled in (see Table 7b-1). A template spreadsheet can be downloaded from the BOLD site (www.boldsystems.org).
The recommended steps are outlined below:
1. Collect Images:
Collect high-quality images of specimens in .jpg format for your
project. BOLD accepts high resolution images (up to 20
megapixels) but only displays a greatly reduced thumbnail. Your
high resolution image is archived but will not be used without
the submitter’s consent. Refer to the Photography Guide for advice
on picture orientation and quality.
2. Assemble Package:
The image submission package should consist of all images (.jpg)
and a spreadsheet with the file names and ancillary data. Make
sure that all images in the package are accounted for in the
spreadsheet. When submitting more than one image per specimen, simply copy the ‘Sample ID’ and ‘Process ID’ to the next line
with the file name of the next image. You can upload 1 to
10 images per specimen, depending on organism characteristics.
Please photograph several different orientations if needed.
The submission spreadsheet should be named ImageData.xls and
contain the columns described in Table 7b-1.

Image File *: Complete (incl. extension) and identical file name (case sensitive) of images.
Original Specimen *: Enter yes if the image shows the actual specimen for this record. Otherwise enter no.
View Metadata *: A short tag describing the orientation of the image that will appear on BOLD.
Caption: Additional information about the image. Copyright info or descriptions are recommended.
Measurement: Measurement that was taken (including the unit of measurement).
Measurement Type: Item that was measured (e.g. body length, wing span, etc.).
Sample ID *: Sample ID for photographed specimen, must match Sample ID in BOLD.
Process ID *: Process ID for photographed specimen, must match Process ID in BOLD.
Table 7b-1: Field definitions for accompanying spreadsheet.
* Required Fields

Steps:
A. Fill in the ImageData.xls data sheet with all the data related
to the images in the submission package.
To create the list of image files in a folder, open a terminal window (Start > Run > cmd in Windows), navigate to the folder
containing the image files, and then run one of the following
commands:
Windows: dir /b *.jpg > list.txt
MacOS: ls *.jpg *.JPG > list.txt
Linux/Unix: ls *.jpg *.JPG > list.txt
These commands will generate a list of all the .jpg files in the current folder and save it in list.txt. You can then open list.txt and
move the data into the Image File column.
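As a cross-platform alternative to the shell commands in step A, the same file list can be produced with Python. This is a minimal sketch, not part of the BOLD tooling: the function names are hypothetical, and matching the extension without regard to case is an assumption made so that both .jpg and .JPG files are picked up on every system.

```python
from pathlib import Path


def list_image_files(folder):
    """Return the names of all .jpg/.JPG files in folder, sorted."""
    return sorted(
        path.name
        for path in Path(folder).iterdir()
        if path.is_file() and path.suffix.lower() == ".jpg"
    )


def write_image_list(folder, out_name="list.txt"):
    """Write one image file name per line to out_name inside folder."""
    names = list_image_files(folder)
    (Path(folder) / out_name).write_text("\n".join(names) + "\n", encoding="utf-8")
    return names
```

The names are written exactly as they appear on disk, which matters because the Image File column is case sensitive.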
Image File | Original Specimen | View Metadata | Caption | Measurement | Measurement Type | Sample Id | Process Id
ROM101912-D.JPG | yes | Dorsal | skull | 15 mm | skull length | ROM 10912 | BM272-03
ROM101912-L.JPG | yes | Lateral | lower jaw | 7 mm | length | ROM 10912 | BM272-03
ROM101912-L2.JPG | yes | Lateral | skull | 15 mm | skull length | ROM 10912 | BM272-03
ROM101912-V.JPG | yes | Ventral | skull | 15 mm | skull length | ROM 10912 | BM272-03
ROM101912-D2.JPG | yes | Dorsal | skin | 50 mm | body length | ROM 10912 | BM272-03
ROM101912-V2.JPG | yes | Ventral | skin | 50 mm | body length | ROM 10912 | BM272-03
ROM101944-D.JPG | no | Dorsal | skull | 17 mm | skull length | ROM 10944 | BM278-03
Figure 7b-1: Image Submission Spreadsheet (ImageData.xls) completed with sample data.
B. These two components (Image files and Spreadsheet) need to be placed in a single folder. Compress them all into a single file
before submitting. The following free tools are available to provide this functionality:
» WinZip - http://www.winzip.com
» WinRar - http://www.rarsoft.com
» MacZipIt - http://www.maczipit.com
C. BOLD will accept a maximum file size of 195 MB. Upload the images to BOLD by clicking on the link Specimen Images in the
Uploads menu of the desired project. Select the zipped folder of images and then hit “submit”.
Image Submission - Tips and Troubleshooting
This section describes the most commonly-encountered image upload problems.
• Zipped file must be under 195 MB in size. If the upload fails to initialize, the zipped file
may be too large. Break it into two uploads, each with its own spreadsheet.
• The spreadsheet cannot contain any formulas.
• If the upload program cannot find the image files, it is possibly because it cannot read
the names. Make sure that the spreadsheet contains text values only.
• Full filenames must be used in the Excel sheet. The extension (.jpg) must be included in the
image file name. The file extension is case sensitive.
• The spreadsheet must be named ImageData.xls. If the upload program cannot find the Excel
sheet, confirm that it is named correctly (case sensitive).
• There is a maximum of 30 characters in the free text fields of the Excel sheet. Verify the data length
in these fields and make adjustments if necessary.
• Data must start on the second line of the spreadsheet. There is only one line for the
column headers.
• Adding extra columns to the sheet will cause errors.
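Several of these checks can be run locally before uploading. The Python sketch below is an illustration only: check_image_package is a hypothetical helper, "195 MB" is assumed to mean 195 × 1024 × 1024 bytes, and the list of file names from the Image File column is passed in by the caller rather than read from the spreadsheet.

```python
import os
import posixpath
import zipfile

# Assumption: the 195 MB limit is counted in units of 1024 * 1024 bytes.
MAX_PACKAGE_BYTES = 195 * 1024 * 1024


def check_image_package(zip_path, listed_files):
    """Return a list of problems found in an image submission zip file.

    listed_files holds the values of the Image File column, exactly as typed.
    """
    problems = []
    if os.path.getsize(zip_path) >= MAX_PACKAGE_BYTES:
        problems.append("zip file is not under 195 MB")
    with zipfile.ZipFile(zip_path) as archive:
        # Compare bare file names so that a zipped folder is handled too.
        names = {
            posixpath.basename(name)
            for name in archive.namelist()
            if not name.endswith("/")
        }
    if "ImageData.xls" not in names:
        problems.append("ImageData.xls not found (the name is case sensitive)")
    for filename in listed_files:
        if filename not in names:  # exact, case-sensitive match
            problems.append(f"listed image missing from package: {filename}")
    return problems
```

An empty result means none of these particular problems were found; it does not cover the formula, column or field-length rules, which depend on the spreadsheet contents.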
Photography Guide
All images should be in landscape orientation, with a 2x3
aspect ratio. If your specimens do not easily fit these
criteria, please try to keep them in a standardized position,
as this makes it much easier to compare specimens within
a project. If desired, a measurement scale may be included
in the image to provide a size reference.
Please take pictures using the high quality mode on your
camera. The specimen should be centered in the image
frame. Photos should be taken as close-up as possible, leaving
very little gap around the edges. The following standard
orientations should be adhered to when appropriate.
Dorsal
• The anterior of the specimen should be facing the top of the
image frame
• The specimen should be face-down, with the dorsal aspect of
the head visible
Lateral
• The anterior of the specimen should be facing the left side of
the image frame
• The specimen should be oriented with the feet towards the
bottom of the image
Ventral
• The anterior of the specimen should be facing the top of the
image frame
• The specimen should be face-up, with the ventral aspect of the
head visible
Figure 7b-2: Suggested sample photographs (Dorsal, Lateral, Ventral).
7c) Trace File Submission Protocol
This protocol assists in the submission of trace files to BOLD. It
describes the necessary format of the files and the ancillary data
that is required for a correct submission.
1. Register Primers:
Please see ‘Trace File - Primer Registration’ for details on how to register primers.
2. Assemble Package:
The submission package consists of trace files (.ab1), corresponding phred files (.phd.1) and a spreadsheet with the file names
and ancillary data. The submission spreadsheet should be named
data.xls and contain the columns described in Table 7c-1.

Trace File *: Complete (incl. extension) and identical file name (case sensitive).
Score File: Complete (incl. extension) and identical file name (case sensitive).
PCR Primers Fwd/Rev *: Primer codes are case sensitive.
Sequence Primer: Primer codes are case sensitive.
Read Direction *: Forward or Reverse.
Process ID *: Process ID of specimen, must match Process ID in BOLD.
Table 7c-1: Field definitions for accompanying spreadsheet.
* Required Fields

Steps:
A. Fill in the data.xls sheet with all the data about your files.
To create the list of the files in a folder, open a terminal window (Start > Run > cmd in Windows), navigate to the
folder where the trace and score files have been placed, and then
run one of the following pairs of commands:
Windows: dir /b *.ab1 > ab1.txt and dir /b *.phd.1 > phd.txt
MacOS: ls *.ab1 > ab1.txt and ls *.phd.1 > phd.txt
Linux/Unix: ls *.ab1 > ab1.txt and ls *.phd.1 > phd.txt
These commands will generate lists of all the trace and score files in the current
folder and save them in ab1.txt and phd.txt. You can then open the text
files and move the data into the appropriate columns.
B. These components (Trace files, Score files and Spreadsheet) need to be placed in a single folder. Compress them all
into a single file before submitting. The following free tools are
available to provide this functionality:
» WinZip - http://www.winzip.com
» WinRar - http://www.rarsoft.com
» MacZipIt - http://www.maczipit.com
C. BOLD will accept a maximum file size of 195 MB. Upload
the package to BOLD by clicking on the link “Trace Files” in the
Uploads panel of the desired project. Select the zipped folder
of files and then hit “submit”.
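The file listing in step A can also be done in Python, pairing each trace file with its score file in one pass. This is a sketch under an assumption taken from the sample data in Figure 7c-1: a score file carries the same base name as its trace file, with .phd.1 in place of .ab1. The function name is hypothetical.

```python
from pathlib import Path


def pair_trace_files(folder):
    """Return (trace file, score file) name pairs for every .ab1 in folder.

    The score file name is left empty when no matching .phd.1 file exists.
    """
    folder = Path(folder)
    rows = []
    for trace in sorted(folder.glob("*.ab1")):
        # Assumption: KKBNA001-04_H01.ab1 pairs with KKBNA001-04_H01.phd.1.
        score = folder / (trace.name[: -len(".ab1")] + ".phd.1")
        rows.append((trace.name, score.name if score.exists() else ""))
    return rows
```

Each pair can be pasted into the Trace File and Score File columns of data.xls; rows with an empty second value point to traces that have no score file in the folder.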
Trace File | Score File | PCR Fwd | PCR Rev | Seq Primer | Read Direction | Process Id
KKBNA001-04_H01.ab1 | KKBNA001-04_H01.phd.1 | BirdF1 | BirdR1 | BirdR1 | Forward | KKBNA001-04
KKBNA001-04r_H07.ab1 | KKBNA001-04r_H07.phd.1 | BirdF1 | BirdR1 | BirdR1 | Reverse | KKBNA001-04
KKBNA002-04_G01.ab1 | KKBNA002-04_G01.phd.1 | BirdF1 | BirdR1 | BirdR1 | Forward | KKBNA002-04
KKBNA002-04r_G07.ab1 | KKBNA002-04r_G07.phd.1 | BirdF1 | BirdR1 | BirdR1 | Reverse | KKBNA002-04
KKBNA003-04_F01.ab1 | KKBNA003-04_F01.phd.1 | BirdF1 | BirdR1 | BirdR1 | Forward | KKBNA003-04
KKBNA003-04r_F07.ab1 | KKBNA003-04r_F07.phd.1 | BirdF1 | BirdR1 | BirdR1 | Reverse | KKBNA003-04
KKBNA004-04_E01.ab1 | KKBNA004-04_E01.phd.1 | BirdF1 | BirdR1 | BirdR1 | Forward | KKBNA004-04
Figure 7c-1: Trace File Submission Spreadsheet (data.xls) completed with sample data.
Trace File - Primer Registration
Be sure that your primer codes are
registered with BOLD before assembling the submission package. To register your primers, select “Register Primers” from the Project Options menu in
your project on BOLD.
On the form, you are asked to fill in
the following information:
Figure 7c-2: BOLD Primer submission form
Primer Code          Create a code for your primer. If the primer is already published in a manuscript, please use the code that is in press.
Direction            Select the direction.
Primer Description   This field is for filling in a description of what the primer is used for.
Reference/Citation   Fill in references and/or citations.
Notes                Notes about the primer.
Alias Codes          Fill in any other known code names for your primer, separated by commas.
Publicly Available   If the primer has already been published, or if you wish to make it publicly available, this should be left public.
Target Marker        Select the target marker from the controlled list of markers (e.g. ITS, COI 5’, matK, etc.).
Primer Sequence      Fill in the sequence, 5’ to 3’.
If the primer you used has already been registered under a different name, you
will be provided with the registered code to be used in your submission.
Table 7c-2: Field definitions for accompanying figure.
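Before filling in the form, it can help to tidy the Alias Codes and Primer Sequence values. The sketch below is a hypothetical pre-check, not something BOLD provides: the function names are invented for illustration, and it assumes the sequence field should contain only IUPAC nucleotide codes. The handbook does not state which characters the form accepts, so treat any reported character as a prompt to double-check rather than as an error.

```python
# Assumption: primer sequences use IUPAC nucleotide codes (including ambiguity codes).
IUPAC_NUCLEOTIDES = set("ACGTRYSWKMBDHVN")


def parse_alias_codes(text):
    """Split a comma-separated Alias Codes value into a clean list of codes."""
    return [alias.strip() for alias in text.split(",") if alias.strip()]


def invalid_bases(sequence):
    """Return the characters in `sequence` that are not IUPAC nucleotide codes."""
    return sorted(set(sequence.upper()) - IUPAC_NUCLEOTIDES)
```

An empty list from `invalid_bases` means every character in the sequence is a recognised IUPAC code under the assumption above.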
Trace File Submission - Tips and Troubleshooting
This section describes the most commonly encountered trace file upload problems.
• Primers must be registered before upload. If the primers are not registered, the upload will produce an error. Please refer to the Primer Registration section for details on how to register primers.
• The zipped file must be under 195MB in size. If the upload fails to initialize, the zipped file is probably too large. Try breaking it into two uploads, each with its own spreadsheet.
• The spreadsheet cannot contain any formulas.
• If the upload program cannot find the files, it may be unable to read the names. Make sure that the spreadsheet contains text values only.
• Full file names must be used in the Excel sheet. The extension (.ab1, .phd.1) must be included in the file name. These extensions are case sensitive.
• The spreadsheet must be named data.xls. If the upload program cannot find the Excel sheet, confirm that it is named correctly (the name is case sensitive).
• Data must start on the second line of the spreadsheet. Only the first line is used for the column headers.
• Do not add extra columns to the spreadsheet.
• Trace files will not be downloadable from BOLD until 24 hours after they have been submitted.
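Several of the checks above can be run locally before uploading. The sketch below works on spreadsheet rows that have already been read into Python as lists of cell values; reading data.xls itself is left out because it needs a third-party library. The function `check_rows` is hypothetical, the seven-column layout follows Figure 7c-1, and requiring both a trace file and a score file on every row is an assumption.

```python
# Columns per Figure 7c-1: Trace File, Score File, PCR Fwd, PCR Rev,
# Seq Primer, Read Direction, Process Id.
EXPECTED_COLUMNS = 7


def check_rows(rows, package_files):
    """Return a list of problems found in the data rows (header row excluded)."""
    problems = []
    # Data starts on the second line of the spreadsheet, so number rows from 2.
    for number, row in enumerate(rows, start=2):
        if len(row) != EXPECTED_COLUMNS:
            problems.append(f"line {number}: expected {EXPECTED_COLUMNS} columns, found {len(row)}")
            continue
        if not all(isinstance(cell, str) for cell in row):
            problems.append(f"line {number}: non-text value in spreadsheet")
            continue
        if any(cell.startswith("=") for cell in row):
            problems.append(f"line {number}: formula found")
        # Assumption: every row names both a trace file and a score file.
        for name, extension in ((row[0], ".ab1"), (row[1], ".phd.1")):
            if not name.endswith(extension):  # case-sensitive check
                problems.append(f"line {number}: {name!r} lacks the {extension} extension")
            if name not in package_files:
                problems.append(f"line {number}: {name!r} is not in the package")
    return problems
```

Pass the set of file names in the submission folder as `package_files`. An empty result means none of these particular problems were detected; it does not guarantee that the upload will succeed, since primer registration is checked by BOLD itself.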
Figure 7c-3: A list of public primers available from the project
console. These are helpful for those who are new to barcoding.
Figure 7c-4: Trace file for Vulpes vulpes (red fox).
7d) Sequence Submission Protocol
This protocol outlines the sequence file submission process on BOLD. It describes the required format of the
sequences and the steps needed for a successful submission.
1. Assemble Package:
The sequence submission should consist of sequences in FASTA format, each referenced by its BOLD Process ID.
2. Upload Package:
You can put up to 1000 sequences into one upload. Upload the sequences to BOLD by clicking
on the link “Sequences” in the Uploads menu of the desired project. Paste the sequences into the
text box and hit “submit”.
Figure 7d-1: Pop-up window for uploading traces
» If you wish to replace a sequence on BOLD, simply upload the new one with the same Process ID.
» If you wish to delete a sequence on BOLD, simply upload “NNNNN” associated with the Process ID.
Example:
>TZBNA001-05
CTGCAGGANCAAAAAATGAAGTATTTAAATTTCGATCTGTTAATAATATAGTAATAGCTCCTGCTAATACAGGTAAAGATAATAATAATAAAAAAGCTGTAATTCCTACAGCTCAAACGAAAAGGGGTAGTTGATCGAAAAATATATTATTTAATCGTATATTAATAATAGTTGTAATAAAATTAATTGCTCCTAAAATAGAAGAA
>TZBNA002-05
CAGCTAATACGGGTAAAGATAATAATAATAAAAAAGCTGTAATTCCTACTGCCCAAACAAAAAGAGGTAATTGATCAAAAAATATATTATTTAAGCGTATATTAATAATAGTTGTAATAAAATTAATTGCCCCTAAAATAGAAGAAATTCCTGCTAAATGAAGAGAAAAAATAGCTAAATCTACAGAACTACCCCCATGGGCGATATTAGAAGATAATGGGGGGTAGACTGTTCATCCTGTT
>TZBNA012-05
AAAATAGCTAAATCAACTGAGCTTCCTCCATGAGCAATATTAGATGATAGTGGGGGGTAAACTGTTCATCCTGTTCCAGCTCCATTTTCTACCACTCTTCTTGAAATTAAAAGAGTAATAGAAGGGGGGAGTAATCAAAATCTTATATTATTTATTCGTGGGAAAGCN
Figure 7d-2: Illustrative barcode for Homo sapiens (human).
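The text to paste into the sequence upload box can be assembled programmatically. The sketch below formats (Process ID, sequence) pairs as FASTA and splits larger sets into batches of at most 1000 sequences, matching the limit in the sequence submission protocol. The function names are hypothetical, and writing each sequence on a single line is an assumption based on the handbook's example.

```python
MAX_SEQUENCES_PER_UPLOAD = 1000  # limit stated in the sequence submission protocol
DELETE_MARKER = "NNNNN"          # uploading this for a Process ID deletes its sequence


def to_fasta(records):
    """Format (process_id, sequence) pairs as FASTA text for one upload."""
    records = list(records)
    if len(records) > MAX_SEQUENCES_PER_UPLOAD:
        raise ValueError(f"too many sequences for one upload: {len(records)}")
    lines = []
    for process_id, sequence in records:
        lines.append(f">{process_id}")
        lines.append(sequence.strip())
    return "\n".join(lines) + "\n"


def split_into_uploads(records, size=MAX_SEQUENCES_PER_UPLOAD):
    """Split records into consecutive batches of at most `size` sequences."""
    records = list(records)
    return [records[i:i + size] for i in range(0, len(records), size)]
```

To request deletion of a sequence, pass `DELETE_MARKER` as the sequence for that Process ID. Each batch returned by `split_into_uploads` can be passed to `to_fasta` and pasted as a separate upload.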
8. BOLD Console
Once your project has been populated with the data,
images, traces and sequences that you have uploaded to
BOLD, it will look like the figures on the right. For further
information on how to navigate a project, please refer to
the description below.
Project Console
The console shows you a report of the number of specimens,
along with tallies of any missing components of the records. The
console includes graphs to provide a quick visual overview of
the project, as well as a list of all the users on the project. The
links to the left provide access to uploads, downloads and various
analysis tools. The record listing can be accessed by clicking on
“View All Records” under the Project Data Views menu in the
upper left corner.
Record List
The record list gives access to the individual specimen and
sequence data for each record. You can select specific records
for analysis or updates using the checkboxes. Icons next to a
record indicate which components are present and which issues
have been flagged, as listed in Table 8-1.
Figure 8-1: BOLD Project Console
GPS coordinates present for sample
Images present for sample
The number of traces present
Stop codons present in sequence
Contamination present in sequence
Flagged record, not in ID engine
Table 8-1: BOLD Record List icons
Click on the Sample ID or the Process ID to access the Specimen
Data or the Sequence Data, respectively, for each record.
Specimen Window
This window provides voucher details, taxonomy, specimen
details and collection data, along with a world map of where
the specimen was collected. The images for the specimen are
located at the bottom of the window. To edit any details, simply
select “Edit” from the upper right corner.
Figure 8-2: BOLD Record List
Sequence Window
The sequence page gives access to various details about the trace
files and sequences for the specimen. Trace files can be viewed
or downloaded from this window. If desired, the ID engine can
be used to identify the sequence.
Near the bottom of the page is an illustrative barcode of
the species, along with a link to the Laboratory Information
Management System (LIMS) for the Canadian Centre for DNA
Barcoding.
Figure 8-3: Specimen Data
Figure 8-4: Sequence Data
Last modified: Oct 2008
BOLDSYSTEMS.org
Biodiversity Institute of Ontario
University of Guelph
579 Gordon Street
Guelph, Ontario, Canada
N1G 2W1
Copyright ©2008 Biodiversity Institute of Ontario