This blue whale (Balaenoptera musculus) was photographed from the air as it surfaced off the coast of Redondo Beach (near Los Angeles, California) to exhale and take a new breath, before diving underwater to feed on krill.
| Blue whale, exhaling as it surfaces from a dive, aerial photo. The blue whale is the largest animal ever to have lived on Earth, exceeding 100′ in length and 200 tons in weight.
Image ID: 25953
Species: Blue whale, Balaenoptera musculus
Location: Redondo Beach, California, USA
| Blue whale swims at the surface of the ocean in this aerial photograph. The blue whale is the largest animal ever to have lived on Earth, exceeding 100′ in length and 200 tons in weight.
Image ID: 25952
Species: Blue whale, Balaenoptera musculus
Location: Redondo Beach, California, USA
I recorded the GPS position (latitude, longitude) each time I took a photo of a blue whale. Curiously, the blue whales remained in a small area directly over the submarine canyon that lies offshore of Redondo Beach, as seen in the Google Earth screenshot below. My hunch is that the krill upon which the blue whales were presumably feeding was gathered in, or near, the canyon. You can click the image below to bring up the Google Earth display, showing the images superimposed where they were photographed above the Redondo Beach submarine canyon.
Keywords: blue whale, aerial photo, Balaenoptera musculus
Update 9/28/2009: see my report and latest photos from Bishop Creek and Rock Creek (cool slideshow) as well as a summary of other links on the web about fall colors.
OK, last post about Fall Color in the Eastern Sierra. I am looking forward to getting up there and am optimistic this year will be a good one for turning aspens (Populus tremuloides). A couple of photographers whom I follow have remarked that they already have their reservations. I was going through some of my favorite aspen shots from a few years ago, reminding myself where I shot them so I can be sure to revisit some of the same spots again. I put them on Google Earth (all of my images are geotagged so they feed automatically into Google Earth). If you have Google Earth installed, you can click either of these links. What **should** happen is that Google Earth will launch and soon after the 18 images will appear superimposed where they were taken in Bishop Creek Canyon. You can click any of the tiny thumbnails in Google Earth to see the image large along with captions.
I am now offering a new service: I will physically walk around with your camera and shoot the photos for you, process them and email you the best ones. It’s a win-win situation: you don’t need to ask your boss if you can ditch work in the middle of the week, you don’t need to make any tiresome hikes in the thin, cold, clean mountain air, you won’t make that long drive up 395 which cuts down on pollution, and the area is less crowded for me.
OK, that part was a lie, but in lieu of that, showing you where I stood to take some nice photos is the best I can do. Enjoy.
Bishop Creek Canyon Fall Color on Google Earth (click to launch this map in Google Earth)
I also recently posted some links where one can soon see reports about Fall Color in the Eastern Sierra. I don’t have reports to offer (I live in San Diego so it’s hard to just get up there for a day or two this time of year) but there are many talented California photographers who do share detailed and timely reports.
Shameless plug: I’ve got a nice collection of fall color photos. Check them out, they really are pretty good if I do say so myself. (Heck, when the colors are peaking it’s hard to take a bad photo of turning aspens.)
Keywords: fall color, eastern sierra, photo, picture, image, aspen, populus tremuloides, bishop creek canyon, google earth, geocoding, geotagging.
I’ve been a fan of Wordpress for almost six years. I started blogging with Wordpress in 2005 and have upgraded several times over the years as new versions have come out. However, it may be time for me to say TTFN to Wordpress. I am usually very slow to upgrade any software I use, preferring to let others discover bugs first and waiting for a few maintenance releases before upgrading. However, after reading that Wordpress 2.8.3 contains several security fixes, on Friday I made the jump from 2.7 to 2.8.3. As soon as I did so, problems arose, bringing my server to its knees and causing a variety of crashes. After two days studying the situation, reading up on recent Wordpress forums, making some configuration changes on the server on which my website is hosted and adding some diagnostic code to the Wordpress source itself, I am finally in a position to say with some confidence: Wordpress 2.8.3 is seriously bloated.
Triple the Memory == Bloat
It appears to me that recent programming changes in Wordpress have caused it to use quite a bit more memory than it did before. It could be that one change is responsible for much of the bloat, or that many small changes are each responsible for their own small shares of the problem.
Let me give an example to illustrate how significant the issue is. When a browser requests my most recent blog entry about humpback whale pictures, a PHP memory allocation on my web server of only 2.8 MB is required when using Wordpress 2.7. However, if Wordpress 2.8.3 is installed the required allocation balloons to 10.1 MB. That’s a factor of 3.6, or 260% more memory. (See below under “Instrumenting …” for how you can determine the memory allocation of your own Wordpress installation; it’s quite simple.) Visitors to my site don’t see ANY difference in the blog content, yet by virtue of upgrading to 2.8.3 the web server was forced to allocate more than triple the memory. That’s crazy.
To see how widespread this problem is, Google “Wordpress + 2.8 + Fatal + Error” or keywords like that. You’ll find that many people have reported problems with memory allocation failures occurring on their web servers after upgrading to Wordpress 2.8. Now, here’s the curious part. The common wisdom in these online discussions seems to be to allow PHP (the scripting language upon which Wordpress is built) to use more memory. For instance, a typical installation of PHP and Wordpress might allow each instance of Wordpress serving up a page to use, say, no more than 32 MB of memory. The Wordpress community seems to recommend raising this limit to 64 MB to make the memory allocation failure problems of Wordpress 2.8.3 go away. Raising this limit can be done in a few ways: 1) by modifying wp-settings.php so that WP_MEMORY_LIMIT is initialized to 64M rather than 32M, and/or 2) by setting “php_value memory_limit 64M” in your .htaccess file.
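For reference, the two changes usually suggested look something like this (a sketch based on a stock Wordpress 2.8.x install; the 64M value is the community’s recommendation, not mine):

```php
<?php
// In wp-settings.php: Wordpress 2.8.x sets its own memory ceiling near the
// top of the file. The stock value is '32M'; the common advice is '64M'.
if ( !defined('WP_MEMORY_LIMIT') )
    define('WP_MEMORY_LIMIT', '64M');

// In .htaccess (Apache running mod_php), the equivalent server-side change
// is a single directive rather than PHP code:
//
//   php_value memory_limit 64M
?>
```

As I argue below, both of these treat the symptom (allocation failures) rather than the cause (the memory the code actually consumes).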
As I searched the net on this issue I saw some comments essentially saying “This is just a server configuration issue, increase the memory to 64M”. Hello? Just a server issue? No, this is in fact a coding issue. If one requires 64 MB to display a page of text with a few widgets around it, something is wrong.
Simply put, increasing PHP’s memory limit is a short-term fix and is sidestepping the true problem, which is code bloat and inefficient programming. Allocating 64 MB to serve up a blog page is overkill, in fact 32 MB is more than enough. Realistically, a lean and mean blog should only require a few MB to do its thing. Think about it: how much memory does it take to query an entry from a MySQL database, run some ancillary functions to surround it with headers, widgets, titles, blogrolls, etc., format it and then present it to the output stream? If your Wordpress pages require more than 32 MB to serve, you had better have very few visitors, a powerful server or an expectation that you will periodically crash on your visitors. If you allow PHP to use up to 64 MB to serve up a single blog page, and you suddenly have a rush of visitors (say 10 at a time), you need plenty of memory on your server, or at least plenty of burst memory, or you stand a chance of having a failure of some kind that one or more of your visitors will see.
My hunch is that prior to Wordpress 2.8, few Wordpress users had to fiddle with the memory limit.
My weblogs say I get about 5000-6000 unique visitors to my site each day, in addition to search engine crawlers. (There are fewer in the summer when students aren’t raiding photos for their reports.) There are times when 100+ simultaneous instances of the httpd daemon are running on my server. Each page view (not just each visitor) invokes a new instance of the Wordpress software which, using 2.8.3 on my installation, would need 10 MB just to serve up the page. My server currently has about 400MB of real memory available, and up to 1 GB in “burst situations” (it’s a virtual private server). Using Wordpress 2.7 I rarely observed occasions when too many simultaneous page views caused an out-of-memory situation. In fact I figure it would take roughly 125 simultaneous views to use up the real memory and 300 or so to exhaust the burst memory. However, immediately after I installed Wordpress 2.8.3 on Friday I began to see PHP fatal errors and server daemons (services) dying due to lack of memory. They continued to occur at least every hour, at times when lots of people hit the site simultaneously. At 10 MB per page view using Wordpress 2.8.3, it would only take 40 simultaneous views (including search engines which are constantly pounding my site) to eat up the physical RAM and start causing problems for the server.
After installing 2.8.3 and encountering problems, I first tried raising the WP_MEMORY_LIMIT and PHP memory limits to 64M as recommended by other Wordpress users, but the problem continued. I’ve made lots of custom tweaks to my installation of Wordpress, and am not afraid to fiddle with the code to get things to work better. So I set out to make some changes to the code in an effort to figure out what was going wrong.
To learn how much memory was being used by PHP to serve up a blog page, I added a few well-placed calls to the PHP functions memory_get_usage() and memory_get_peak_usage(). memory_get_usage() reveals how much memory is in use at the moment the function is called, while memory_get_peak_usage() shows the maximum amount of memory required up to the point the function is called. In the index.php source file in the main Wordpress directory, I added a single line echoing memory_get_peak_usage(), just before the final “?>”. This produced an additional line at the bottom of the blog, showing the peak memory use (in bytes) required by PHP to run the Wordpress scripts that generated the page being viewed.
(Note: calling this simple line of code instrumentation may be a bit rich since it is nothing more than one line of debug code. I actually instrumented wp-settings.php, adding memory checks before most of the require and require_once statements. Doing this allowed me to watch the memory accumulation ratchet up as each require statement is executed. I did not see a single obvious point where the memory ballooned, rather the accumulation was steady across all the require statements.)
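For what it’s worth, the instrumentation amounts to something like the following (my paraphrase; the output wording is an assumption, but memory_get_peak_usage() and memory_get_usage() are standard PHP functions):

```php
<?php
// Added to index.php, just before the final "?>": print the peak PHP
// memory used while generating the page, in bytes.
echo '<p>peak memory: ' . memory_get_peak_usage() . ' bytes</p>';

// The wp-settings.php variant: a checkpoint ahead of each existing require
// line, to watch the allocation ratchet up file by file. For example:
echo 'before ' . WPINC . '/widgets.php: ' . memory_get_usage() . " bytes\n";
require (ABSPATH . WPINC . '/widgets.php');
?>
```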
It was in this way that I learned that Wordpress 2.8.3 was using over 10 MB to serve up a single blog entry in my installation. This was the default installation of 2.8.3; the only change I made was to the theme (I use classic) and one plug-in. I tried uninstalling the only plug-in (WP-Geo) but it made virtually no difference. It appears the bloat is in the Wordpress code itself. I made the same one-line addition to index.php in my 2.7 Wordpress installation and was pleasantly surprised to see how little memory was required, only 2.8 MB. The solution was obvious: revert to my prior Wordpress version. Since I had kept a copy of the entire Wordpress directory structure on my server before upgrading to 2.8.3, reverting to 2.7 was just a matter of renaming directories and running a few tests.
I wrote the code for the non-blog part of my stock photography web site to use between 1 and 3 MB of memory for any given page. For instance, this page of bald eagle photos requires about 2.5 MB of PHP memory to load. I’m comfortable with that. But 10 MB, which is what Wordpress 2.8.3 was requiring for a simple blog page, is simply too much. So I have reverted to Wordpress 2.7 for now, and am keeping my fingers crossed that the talented Wordpress development community can make some improvements in the memory usage. My sense is that the community development for Wordpress focuses on adding “features” and little energy is devoted to improving existing code so that it operates more efficiently. However, if enough users experience the bloat problems that I have seen, Wordpress developers may take the issue seriously. We’ll see what happens. I am hopeful.
I’m also doubling the amount of memory on my server. Just to be safe.
A few related links:
Blue Anvil Journal » Blog Archive » Wordpress 2.8 Memory Usage
Seven Reasons Why Wordpress 2.8 Is Better Than Ever | Clint Maher
Allyn Gibson · On WordPress Woes
Google Search on “Wordpress” + “2.8” + “64M” finds hundreds of comments on this problem.
Keywords: Wordpress, memory usage, upgrade, bloat, blog, software, version, server, PHP, fatal error, allocation failure.
I recently made a banzai road trip up to Bodie State Historic Park, near Mono Lake, to make some photos. There is some danger that certain state parks in California may close soon due to budget cuts. I’ve wanted to see the Bodie ghost town for a while and figured, with the danger of the park being closed soon, I had better get on with it. So I made a reservation a few weeks in advance for the Saturday sunrise photographer’s access and decided to give it a try. I got to the park gate about 30 minutes before sunrise, and to my delight found that only two other photographers were there that morning. Solitude, at sunrise, in one of the finest ghost towns in the country. (OK, granted, Bodie is not technically a “ghost town”, but I think of it that way.) I basically had the entire town to myself and did not bump into another person for at least an hour. It was quiet and cool, with clear skies and dew on the grass. I spent about two hours wandering around, peering into the old homes, barns, shops and town halls. I had my pocket GPS recording the whole time. I shot wide (is there any other way?): 1DsIII with 16-35 and 1DsII with 24-70, along with my uber-mikro-mega-infrared-digikam for some infrared grab shots in black and white.
After returning home it was time to caption my shots. I was very glad I had geocoded them! (See my post “How to Geocode Your Photos” for more info.) I had no inkling of the names of any of the buildings I had photographed, with the exception of a few obvious subjects (e.g., gas station, car, church). However, since I had geocoded the images, I was able to display them on Google Earth with a simple click, along with a track showing my wandering path through the ghost town. (You will need Google Earth installed to display these links. Click the image below and it will load in Google Earth, displaying a selection of images and my foot path. You can then zoom to your heart’s delight.)
To identify my images, I simply compared the locations in the above Google Earth display (with each image shown exactly where it was taken in town) against this handy map identifying all the major buildings in town:
I had the two applications appearing on side-by-side monitors, zoomed in a little so the images in Google Earth did not overlap too much. In just a few minutes I was able to match up all of my shots with the correct names of the buildings portrayed in the photographs. Geocoding these saved me a LOT of time. Granted, I could have spent time logging notes while I was shooting, and referring to them later. However, I don’t keep a log. Of anything. So that approach doesn’t work for me. (I don’t log my diving, my running, my swimming, nothing. You don’t log your sex, do you? Then why would you log your photography?)
I just let the GPS do it for me, saves me a lot of time.
See more of my Bodie State Historic Park photos.
Metadata (n, pl): data about data. Any questions?
Recently, there has been a lot of discussion in photography circles about metadata: what is it, how to manage it, what is it good for, etc. Some of the photographers I follow in the blogosphere and more recently on Twitter have interesting things to say on the matter (look to the right for links to some of these guys). I decided to offer some comments about how I use metadata, in the hope these might be useful to other photographers. Who the hell am I and why do my comments matter, you wonder? Good question. I do not have much of a profile among photographers, which is somewhat intentional, but I do have a website that does well with the one search engine that really matters. By way of introduction, here is a short bio about me and about how my website developed over the last 11 years. During that time I have learned how to leverage photographic metadata on a photography website (at least search engines seem to like my site) and am willing to share some of what I have learned. As an aside, other than maintain a website I do no marketing whatsoever, nor do I send out submissions anymore. All of my licensing activity comes either because a client contacted me via my website, or through a couple of old-fashioned photographer-representative-type agencies I am with. Revenues stemming from my website exceed the agency revenues by about 8:1. I attribute this to the effective use of metadata on my website.
If your goal is to develop a stock photography website that shows up in search engine results, metadata about your photographs is crucial. Text, in particular metadata accompanying photos, is all that search engines are able to grab and hold on to as they try to index and spider a website. If your site displays beautiful images with little metadata to accompany them, your site stands a good chance of not appearing in meaningful search engine results. Except for specialized search engines that index image data directly (e.g., Tineye), search engines use the textual information on your site when evaluating it. This goes for images too — search engines will consider the text associated with an image when trying to categorize an image. If you have organized that text information well, and made sure it includes meaningful metadata about the image(s) that are displayed on that web page, that image or page at least has the potential to show up well in search results.
In my workflow there are three types of metadata that I am concerned with:
- EXIF: shooting parameters, recorded by the camera
- GEO: geographic data, if I am geocoding the images
- IPTC: user-supplied information, describing characteristics and business matters related to the image or me.
Following is a description of my photography workflow, from the time the images are downloaded to a computer until my website is updated to include the most recent images. The percentages are the relative time it takes for each step, not including the selections, editing and Photoshop work which take place at the very beginning and which are independent of the metadata side of things.
Step 1: EXIF, The Default Image Metadata (5%)
First I edit the shoot down to keepers. Typically, each keeper is a pair of files: one raw and one “master”. The raw file automatically contains EXIF data about the shooting parameters, copyright information, etc. The master file, usually a 16-bit TIFF or high quality JPEG that is a descendant of the raw file, having been processed in a raw converter and/or Photoshop, contains the EXIF data as well. At this point nothing special has been done about metadata. The EXIF metadata that is already in the images was placed there by my camera, requiring no work on my part, and is what I consider “default metadata”.
I back up my RAW keepers at this point. They have not been touched by any digital management or geocoding software; they are right out of the camera. These go on a hard disk and on DVDs, and are set aside for safekeeping in case the RAW file is somehow corrupted later in my workflow. It has not happened to me yet, knock on wood, but one never knows…
Step 2: Geographic Metadata, Geocoding (optional) (5%)
If I have geographic location data, it is added now. I often geocode my images, which is the process of associating GPS information, e.g., latitude, longitude and altitude, with the image. I use a small handheld GPS to record the locations as I shoot, and these locations are added to the images by a geocoding program. Conceptually, geocoding gives the image some additional value, since it is now associated with a particular place at a particular time. Sometimes the accuracy of this geocoding is as tight as 20′ (6m). It usually just takes a few minutes to launch the geocoding application, point it to the images and the GPS data, and have it do its thing.
Having GEO data in the image, and later in the database that drives my website, allows me to do some interesting things with my images and blog posts, such as presenting them with Google Earth at the location where they were shot. For example, this photo of the Wave in the North Coyote Buttes is geocoded, and can be viewed in Google Earth by clicking the little blue globe icon. The same goes for most of the blog posts I have: they can be viewed in Google Earth at the right place on the planet. Here is another example. If you have Google Earth installed on your computer, you should be able to click on both of the next two links, which will open into Google Earth. One will display a track and the other will overlay photos, both from a recent aerial shoot around San Diego:
Yes, somewhat crude, but we are in the early days of geocoding and there will be more interesting things in the future we can do.
I’ve written a fairly lengthy post describing how I geocode images: How To Geocode Your Photos. At present, I use a free application named “GPicSync” to add GEO data into each image. This application will update the EXIF information in my RAW and master images to include latitude, longitude and altitude.
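As a side note, once GPicSync has written the GPS tags, any EXIF-aware tool can read them back. Here is a sketch in PHP (not part of my workflow; the filename is hypothetical). PHP’s exif_read_data() returns the coordinates as degree/minute/second rationals, which need converting to decimal degrees:

```php
<?php
// Convert an EXIF GPS coordinate (an array of rational strings such as
// "33/1", "50/1", "1234/100") plus its hemisphere reference ("N"/"S"/"E"/"W")
// into signed decimal degrees.
function gps_to_decimal($coord, $ref) {
    $parts = array();
    foreach ($coord as $rational) {
        list($num, $den) = explode('/', $rational);
        $parts[] = $den ? $num / $den : 0;
    }
    $degrees = $parts[0] + $parts[1] / 60 + $parts[2] / 3600;
    return ($ref == 'S' || $ref == 'W') ? -$degrees : $degrees;
}

$exif = exif_read_data('bodie-barn.jpg');  // hypothetical geocoded image
if ($exif !== false && isset($exif['GPSLatitude'])) {
    $lat = gps_to_decimal($exif['GPSLatitude'], $exif['GPSLatitudeRef']);
    $lon = gps_to_decimal($exif['GPSLongitude'], $exif['GPSLongitudeRef']);
    echo "$lat, $lon\n";
}
?>
```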
A bit of opinion: my belief is that having GEO data associated with your image, on your website, is almost certainly a good thing. Even if no person ever looks at it, there are new technologies coming online constantly that look for, index, spider, collate and retrieve images and web pages based on their GEO data. Those images and web pages that are lacking GEO data will not see any of the advantages that these new technologies offer. I admit I am no expert on this, and the entire geocoding world, along with the entities out there that are indexing geocoded web pages, is all rather new to me. However, I am certain that there will be visitors to my site, and probably already have been many, that arrive as a result of the GEO data that is present alongside my images and blog posts. Having the GEO data embedded in the metadata of the photograph is the first step in this process.
Step 3: Import Images into Digital Asset Management Software (5%)
I import the keeper images, both RAW and master, into Expression Media, which is the software I use for “digital asset management” (whee, yet another acronym buzzword: DAM). I’m no fan of Microsoft, but I do like Expression Media and am used to it (I formerly used its predecessor, IView). In particular, Expression Media allows programs (scripts) to be written in Visual Basic. The scripting feature alone is worth its weight in gold as I will point out in the last step of my workflow, and is what makes my processing of images so automated now. I’ve written a dozen or so scripts. It’s quite easy. I have had no training, and have never read any manual for the software. I just based my scripts on examples I’ve found on the internet from other Expression Media users, modifying them to meet my own workflow needs. They carry out mundane tasks and really speed the process up, for example:
- Set baseline IPTC metadata, including copyright notice, name, address, email, website.
- Set baseline “quality”, based on the camera model information in the EXIF. In this way I can rank certain images higher on the website if they were shot on a better camera, other factors being equal. I normally don’t want images shot with a point and shoot to appear before those shot with a 1DsIII. I’ve come up with a baseline ranking scheme to differentiate the following image sources relative to one another in terms of typical quality (not in this order, however): Canon 1DsIII, 1DsII, 1DIIn, 5D, 50D, 30D, Nikon D100, Panasonic Lumix LX3, LX2, Nikon Coolscan LS5000, LS4000, various drum scans. I can easily fine-tune this later for individual images, increasing or decreasing the “quality” of each image so that certain images appear first when a user views a selection of photos.
- Determine the aspect ratio (3:2, 4:3, 16:9, custom) and orientation (horizontal, vertical, square, panorama) of the master image, which may be different from that of the raw image(s) from which it is sourced. This is important for cropped images and for panoramas and/or HDR images assembled from multiple raw files. The script recognizes the multiple raw files that are used to generate a single master file.
At this point my images have EXIF metadata, perhaps containing GEO data if a geocoding step was performed, and basic IPTC metadata that identify the image as mine, how to reach me, etc. So far all I have done is run some applications and scripts. I really haven’t done any “manual” keywording or captioning yet. If necessary, the images are now ready to place on the web, since they have a minimal set of metadata in them that at least establishes them as mine (DMCA anyone?). However, the most important step is to come.
Step 4: Keywording and Captioning (80%)
It’s time to add captions, titles, keywords, categories, etc. to the image. With my new images already imported in Expression Media, and already containing full EXIF metadata and baseline IPTC metadata, I am ready to begin.
- Captions. There is no shortcut for this. Each image needs a decent caption. It is common to group images and assign the same caption to all of them, and then fine tune captions on individual images as needed. The notion of a “template” can be used too, and lots of different DAM applications support this. Whatever application you use to caption your images, there is no alternative but to get your hands dirty and learn how to do it, what approach works best for you. A key concept is to caption well the first time, so you don’t feel a need to return in the future and add more.
- Keywords (open vocabulary descriptors). In general, the same notion as captioning applies here. However, DAM applications often have special support for keywords, allowing you to draw keywords from a huge database of alternatives, facilitating the use of synonyms, concepts, etc. Expression Media allows the use of custom “vocabularies”. A vocabulary is basically a dictionary. For animal images, I developed a custom vocabulary/dictionary of 26,000 species, including most bird and mammalian species, with complete hierarchical taxonomic detail. So, when keywording, I simply type in the Latin (scientific) name for a group of images (all of the same species) and up pops a taxonomic record in the vocabulary, showing kingdom, phylum, family, genus, species, etc., and a bunch of important scientific gobbledygook for the species. Hit return and bingo, all the images I have highlighted are keyworded with appropriate taxonomic metadata. Similar ideas work for locations. I do not do much keywording for “concepts” (e.g., love, strength, relationships, childhood) since I do not pursue that sort of thematic stock; there is enough of that in the RF and micro stock industries already. Here is a list of keywords I currently have among my images.
- Categories (closed vocabulary descriptors). This is the third area of captioning that I find important. Images in my stock files are typically assigned one or more “categories”, and these categories are stored in the metadata of the image alongside captions and keywords. Some examples are: Location > Protected Threatened And Significant Places > National Parks > Olympic National Park (Washington) > Sol Duc Falls and Subject > Technique > Aerial Photo > Blue Whale Aerial. Here is a stocklist of categories I currently have among my images.
- Custom Fields for the website. I have a few other metadata fields that are seen by website visitors that I set via Expression Media scripts. For example, once the captions are created, a script can be used to create “titles” for a group of images, which are really just excerpts of the full captions and can be used for HTML titles, headers, etc. For the most part, these additional metadata fields are secondary in importance to the captions, keywords and categories.
- Custom Fields for Business Purposes. In addition, I use some metadata fields for recording characteristics of the image that I need to track for business reasons. These include licensing restrictions, past uses that affect exclusivity, etc. These metadata are embedded in the image so they are sure to travel with the image as it moves to a client, but they are not presented to the public on the web site.
Note that I consider keywords to be “open vocabulary”, in the sense that any keyword can be used with an image. In other words, I don’t hesitate to add keywords that I have not yet used; it’s an open set and grows as needed. This is especially true of synonyms, but one doesn’t want to get too carried away with synonyms or it can dilute the search results that a web visitor sees. I often go back later and add keywords to images that are already in my stock files. However, I treat categories as “closed vocabulary”, in that I have a relatively fixed set of hierarchical categories. I will introduce a new category when it makes sense, but usually only when there is a sufficiently large group of images to which it applies, and there is not already a similar category in use.
Once all the metadata for the keepers in my latest shoot are defined in Expression Media, they need to be written out to the images themselves. In other words, Expression Media is aware of these things, but if one were to open one of the images (RAW or master) in Photoshop the new metadata would not be there. This last step in Expression Media is referred to as “syncing” the annotations. (“Annotations” is Expression Media’s word for metadata. I guess “metadata” is scary to people.) I highlight all the files for which I have been adding metadata, then Action -> Sync Annotations -> Export Annotations To Original Files and click “OK”. All the metadata is now stored in the images themselves, and will flow into any derivative images that are created, such as the thumbnails and watermarked JPGs that go onto my web site. (Think DMCA!)
Step 5. Downstream, or, “Go Forth My Minions” (5-10%)
If I have defined the metadata once, there is no need to do it ever again. The metadata, which is now contained in the DAM application but also in the header of each image, “flows downstream” with no further effort. For my purposes, “downstream” can mean a submission of selects sent to a client, or a submission of images to an agency, or an update of my website.
Downstream to Clients
There is not much to say here. Best practices in delivering images to clients include using metadata properly. If you are sending out images to clients, or to stock agencies (the old-fashioned kind that actually represent their photographers) or to, for shame for shame, stock portals (RF, micro, they are all evil), then you should have rich, accurate metadata embedded in your image. It is the only way to ensure that the information travels with the image. I’ve received submission requests from potential clients who simply wanted JPGs submitted as email attachments, with the proviso that if a JPG did not have caption and credit embedded in the metadata it would be immediately discarded without consideration.
Downstream to the Web
For many photographers, the final step in processing a new shoot is to update one’s website. In other words, get the new images along with all their metadata (captions, keywords, GEO locations, categories, etc.) onto the web so that they can be seen by the entire world.
For photographers who are using a “gallery” of some kind to host their web site (such as Smugmug, Flickr, PBase, or any of the freely available installable gallery software packages), simply uploading the images into a new (or existing) gallery is usually all that is necessary. Provided you have managed your metadata in step 4 properly, the metadata will be present in the headers of your new images. As these images are uploaded to the gallery, the gallery software peeks into the header of each image for metadata and, if it is found, extracts the metadata and prepares it for display alongside the image. The details of what metadata are used (caption, keywords, location, GEO, name, copyright, restrictions, EXIF, etc.) differ somewhat from one gallery provider to another, but the general idea is the same.
However, see the final notes at the end of this post for a few caveats about how gallery software may alter your metadata as it processes your image.
My situation is conceptually the same. My website software is essentially a “gallery” including a pretty extensive search feature. However, the software was handwritten by me and does not extract metadata from image files automatically like the big-boy galleries do. (Perhaps someday I’ll figure out how to do that.) As I described a few days ago, my web site evolved to be written entirely in PHP and MySql. Underneath the website there is a database that contains information about all 25,000 images in my collection. Basically, this database **is** the metadata for my images, or a summarization of those metadata. The database has one record per image. Each record stores the metadata for that image: caption, keywords, image name, location, GEO data, categories, orientation, etc. That said, the issue for me is: how to create this database? The gallery software in the previous paragraph does this automatically, but my home-brewed web software does not.
The beauty of using Expression Media for DAM in my workflow is that with a single click, Expression Media can create this database for me. (Although I have not used other DAM applications, I am sure they are similar.) Expression Media has a few ways of doing this. I could use Expression Media’s built in export functions (Make -> Text Data File or Make -> XML Data File). But after doing this for a while I decided to write a BASIC script within Expression Media that creates the database while doing some fine tuning and error checking on the metadata fields as it does so. Either way, if I use a script of my own or Expression Media’s built-in export features, the database is easily created. Then it is simply a matter of uploading the database along with the images when it is time for a website update.
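The export-then-load step can be sketched roughly as follows. This is a hypothetical illustration using Python and SQLite, not my actual BASIC script or the site’s real MySQL schema; the column names and the tab-delimited layout are assumptions standing in for whatever the Make -> Text Data File export produces.

```python
import sqlite3

def build_db(export_lines, db_path=":memory:"):
    """Load a tab-delimited text export (first line = column headers)
    into a one-record-per-image table. Field names are illustrative."""
    conn = sqlite3.connect(db_path)
    conn.execute("""CREATE TABLE IF NOT EXISTS images (
        image_id  TEXT PRIMARY KEY,
        caption   TEXT,
        keywords  TEXT,
        location  TEXT,
        latitude  REAL,
        longitude REAL)""")
    header = export_lines[0].split("\t")
    for line in export_lines[1:]:
        fields = dict(zip(header, line.split("\t")))
        conn.execute(
            "INSERT OR REPLACE INTO images VALUES (?, ?, ?, ?, ?, ?)",
            (fields["image_id"], fields["caption"], fields["keywords"],
             fields["location"],
             float(fields["latitude"]) if fields["latitude"] else None,
             float(fields["longitude"]) if fields["longitude"] else None))
    conn.commit()
    return conn
```

The error checking I mentioned (verifying fields are non-empty, lat/long parse as numbers, and so on) would slot naturally into the loop before the insert.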
The point here is that once the work is done in the DAM application, it should be a very quick process to upload the images and metadata to the web and get the images out there for the world to see. Then, if all goes well, the phone rings.
After all that work defining the metadata for your images, and ensuring that it is embedded properly in each image, you would think you are home free, right? Well, there are a few provisos you should know.
Metadata Can Be Stripped By Gallery Software
Some stock portals, gallery hosting services, or install-yourself gallery software (usually written in PHP) will strip metadata from an image. That’s right, they will strip it right out of your image! Why? They claim the reason is to shrink the JPGs that are displayed on the web, in an effort to reduce bandwidth. While this is true, it is a big mistake in my opinion, and is one of the principal reasons I am not involved in any of the stock portal sites or popular photo hosting services. I want my metadata to stay with the image wherever it goes, to all derivative versions of the image. The few extra bytes of storage required for this are trivial compared to the importance of this data being preserved. Think DMCA! Think Orphan Works!
Metadata Can Be Stripped By A Thief
When a thief, or some unwitting schoolkid, makes a copy of your image off the web, the chances are quite good the metadata will be stripped. If the image is taken via a screen shot, the metadata will disappear. If the thief/kid uses “right-click and Save As”, the metadata should remain in the image. But in the end, if the thief/kid alters the image in Photoshop and uses “Save For Web” to save a new copy, the metadata will probably be stripped out. (Yes, Save For Web can optionally preserve metadata, but it is easy to configure Photoshop so that it strips metadata from the image in “Save For Web”, and older versions of Photoshop do not offer the option to override this.)
Too Much Metadata Can Be Displayed
The photo hosting sites seem to display the EXIF fields (shooting data) of your photo’s metadata. This may or may not be what you want. Among hobbyists there is little concern about making the date, time of day, and technique (ISO, shutter speed, aperture) known. Indeed, it is one of the ways that we learn, by understanding what others have done. But often pros have good reason to keep this information to themselves. So, the caveat here is: if you are using a photo hosting service and you don’t want the EXIF data in your image available on the web, you may need to take steps to prevent it.
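One way to take those steps is to strip the Exif segment from a copy of the JPEG before uploading it. Here is a rough standard-library sketch of the idea; in practice a dedicated tool like ExifTool handles many more cases (EXTended APP1 blocks, XMP, thumbnails) and is the safer choice.

```python
import struct

def strip_app1(jpeg_bytes):
    """Return a copy of a JPEG with its APP1 (Exif/XMP) segments removed.
    A minimal sketch: walks marker segments until Start Of Scan, then
    copies the entropy-coded pixel data verbatim."""
    assert jpeg_bytes[:2] == b"\xff\xd8", "not a JPEG (missing SOI marker)"
    out = bytearray(b"\xff\xd8")
    i = 2
    while i < len(jpeg_bytes):
        marker, = struct.unpack(">H", jpeg_bytes[i:i+2])
        if marker == 0xFFDA:            # Start Of Scan: pixels follow
            out += jpeg_bytes[i:]       # copy the rest untouched
            break
        length, = struct.unpack(">H", jpeg_bytes[i+2:i+4])
        segment = jpeg_bytes[i:i+2+length]
        if marker != 0xFFE1:            # keep every segment except APP1
            out += segment
        i += 2 + length
    return bytes(out)
```

Of course, given everything above about the value of embedded metadata, this is a tool to use selectively, on web copies only, never on your masters.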
Here are a couple of new aerial photos from our recently updated collection of San Diego Photos.
| New Point Loma Lighthouse, situated on the tip of Point Loma Peninsula, marks the entrance to San Diego Bay. The lighthouse rises 70′ and was built in 1891 to replace the “old” Point Loma Lighthouse which was often shrouded in fog. San Diego, California, USA.
Location: San Diego, California, USA
View this Image in Google Earth!
| Downtown San Diego and Petco Park, viewed from the southeast. San Diego, California, USA.
Location: San Diego, California, USA
View this Image in Google Earth!
Recently I had a chance to go flying with Ron Niebrugge, an Alaska-based photographer who spends a lot of time shooting in California in winter months. We had a picture-perfect flight, just super, and managed to fly over most of the landmarks we had on our wish list. If you use Google Earth, you can see our flight track as well as a sampling of the images I took, positioned where they were taken, by clicking the two links below (you’ll need Google Earth installed for this to work):
I managed a lot of keepers and will be posting some of them in the coming days.
In the course of geotagging some old photos from Catalina Island, I realized it was simple to display the popular Catalina Island dive locations in Google Earth, Google Maps and Live Search Maps. So here they are. The map below is a small version of the Google Earth display, click on it to launch it in Google Earth and be able to zoom in much closer. You’ll need Google Earth installed on your computer for this to work. While it looks best in Google Earth, you can also view it in Google Maps and Live Search Maps, neither of which requires you to install any software to use.
Map of Catalina Island Dive Sites in Google Earth, Google Maps, or Live Search Maps
To see all of our photos keyworded with both “Catalina” and “Island” in Google Earth, click here: Catalina Island photos.
Recently I’ve had some correspondence with other photographers about geotagging, what it is and how I am using it. I was encouraged to put my remarks on my blog. While I do not pretend to be an expert, I am happy to share what I am doing — my workflow if you will. I’ll probably revise this post as I give the matter further thought.
GEOTAGGING DIGITAL PHOTOS is the process of tagging (i.e., merging, joining) digital photos with information about the location where they were taken. So, geotagging (v) is a process in which digital photos are modified. Geotagged (adj) describes a digital photograph as having location information embedded in it. Wikipedia has a good article about geotagging.
1. Digital Photos and Metadata. Digital photos exist as computer files. Two common file types (formats) are JPEG and TIFF, but there are many others. Many of these file formats store not only the image itself (the pixels) but also metadata about the image. Metadata means “data about data”. In this case, the primary data are the pixels in your digital image, and the metadata are other pieces of information that describe the photo or the circumstances under which it was taken. Some examples of metadata are the date and time at which the photo was recorded, the camera exposure settings, the camera brand and model, lens focal length and even the version of the camera’s firmware. These metadata are organized into a bundle and stored in the file header of your digital file. In other words, this stuff is in your TIFF, JPEG or raw file. It happens to be stored at the beginning of the file, before the pixels. Maintaining these metadata inside your digital photo file is, in theory, a good thing since this information then remains with its associated image. As long as you have the photo, the data about how, when and where it was taken are in your possession as well. Furthermore, if you make derivative copies of the digital file, such as a smaller version for display on the internet or a version to send to an editor at a magazine, the metadata are in that version of the image too. Ideally, the metadata stay with the photo wherever it goes. (Naturally there are exceptions to this which I won’t get into, but you get the idea.)
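To make the “stored at the beginning of the file” point concrete, here is a small Python sketch that walks a JPEG’s segment headers looking for the Exif metadata block. It is an illustration of the file layout, not production code; it handles only the simple, common case.

```python
import struct

def find_exif_segment(jpeg_bytes):
    """Walk the JPEG marker segments and return the raw Exif payload,
    or None. The metadata lives in APP1 segments in the file header,
    before the compressed pixel data (which begins at the SOS marker)."""
    if jpeg_bytes[:2] != b"\xff\xd8":
        return None                      # not a JPEG (no SOI marker)
    i = 2
    while i + 4 <= len(jpeg_bytes):
        marker, length = struct.unpack(">HH", jpeg_bytes[i:i+4])
        if marker == 0xFFDA:             # Start Of Scan: pixels begin here
            return None                  # header ended without Exif
        payload = jpeg_bytes[i+4:i+2+length]
        if marker == 0xFFE1 and payload.startswith(b"Exif\x00\x00"):
            return payload
        i += 2 + length
    return None
```

Notice that the loop never has to read the pixel data at all; everything of interest sits before the Start Of Scan marker, which is exactly why metadata can survive resizing, recompression and other derivative-making so long as the software copies the header along.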
2. EXIF Metadata. An industry group (i.e., a group of computer geeks with decision-making power) developed a standard, or group of widely accepted rules, for organizing these metadata. They named the standard EXIF. Each piece of information in this bundle of EXIF metadata is known as an “EXIF field”. For example, date, time, lens, camera model, shutter speed, etc. are all “EXIF fields”. So, when you hear mention of “the EXIF data”, or “EXIF header”, just know that this refers to the metadata describing when and how the photograph was taken. EXIF metadata are generally considered read-only in the sense that they should not be altered. Indeed, most image editing programs such as Photoshop will allow you to see what the EXIF fields are but do not allow you to alter them. This read-only restriction is really just an industry practice — there is no physical reason why the EXIF fields cannot be altered. Indeed, there are software programs out there that allow you to fiddle with and change the EXIF fields, such as time, date, camera model, etc., but I don’t have any experience with them. For the most part EXIF data are created at the moment the image is taken and there is no reason to change them later — with the exception of latitude, longitude and altitude.
It should be mentioned that there are some other bundles of metadata that may be found in the header of your digital photo and which can be viewed with image management and editing software. XMP and IPTC are two of them. XMP is a more recent standard that, in the long run, may prove to be more flexible and useful than EXIF which has some shortcomings. IPTC is another group of metadata fields, developed for press photographers to store descriptive information about their news photographs. IPTC is the place where you would enter a caption, keywords and copyright restrictions about the photo. While XMP and IPTC are important groups of metadata for digital photographers to understand, I will only be describing the EXIF metadata since that is where latitude and longitude fields are.
3. The Latitude, Longitude and Altitude Fields in the EXIF Metadata. There are three EXIF fields of interest for geocoding: latitude, longitude and altitude. While there are some recent cameras that support communication with GPS equipment in real time and fill these EXIF fields when the photograph is taken, most of us will find that these fields are empty or do not exist in the EXIF metadata of our photographs. Essentially this is because the camera is unaware of your latitude, longitude and altitude. Sure, the camera probably knows the date and time (you set these when you first get your camera) and it sure knows what lens is being used and what the shutter speed is. But in general, your camera does not know where you are. The EXIF standard includes fields (spaces) for latitude, longitude and altitude. But since the camera does not know your location when the photo is taken, these fields are left empty. It’s up to you to fill them in later by geotagging the photo after it has been downloaded to your computer.
4. Recording GPS Data. There’s not much to say here. Simply purchase a GPS that supports tracking latitude/longitude to data files, then carry the GPS with you and make sure it is tracking your location while you shoot photos. I use the relatively small Garmin 60CSX model, which is capable of determining latitude, longitude and altitude to within about 20′. That is accurate enough for my purposes. I installed a 4GB micro-SD memory card in the Garmin 60CSX and set up the tracking options so that when I turn it on it automatically records latitude, longitude, altitude and time to a file on the memory card. There are various spatial and/or temporal intervals at which points on the track can be recorded; I have chosen 10-second intervals. (On one flight I made I chose a mode in which a location point was recorded to the track whenever the plane had travelled more than about 20 yds so it recorded many points during the flight. The result was a big GPS data file but very accurate geotagging later when the location data were stored in the photos.) On a recent 10-day trip, during which I had the unit recording about half of all daylight hours, I found that less than 1% of the 4GB micro-SD memory card was used to store tracking data. I have a few multi-week trips planned in 2009 and 2010, and this setup should record the GPS data for every moment of the trip with no trouble. I do find that I have to change batteries at least once a day if the unit is continuously operating, so rechargeable AA batteries are the way to go. The files that are created on the memory card are “GPX” files; GPX is simply a form of XML text file that geotagging programs understand. On my Garmin 60CSX, one file per day is created containing all the GPS tracking data for that day (even if I have turned the GPS on and off several times during the day).
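Because GPX is plain XML, the trackpoints are easy to read with any XML parser. Here is a minimal Python sketch assuming the standard GPX 1.1 namespace; each `<trkpt>` carries lat/lon as attributes plus optional `<ele>` and `<time>` children.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

GPX_NS = "{http://www.topografix.com/GPX/1/1}"   # standard GPX 1.1 namespace

def read_trackpoints(gpx_text):
    """Return a list of dicts, one per <trkpt> in the GPX document."""
    root = ET.fromstring(gpx_text)
    points = []
    for trkpt in root.iter(GPX_NS + "trkpt"):
        ele = trkpt.find(GPX_NS + "ele")
        time = trkpt.find(GPX_NS + "time")
        points.append({
            "lat": float(trkpt.get("lat")),
            "lon": float(trkpt.get("lon")),
            "ele": float(ele.text) if ele is not None else None,
            "time": datetime.strptime(time.text, "%Y-%m-%dT%H:%M:%SZ")
                    if time is not None else None,
        })
    return points
```

Note that the `<time>` values in a GPX track are in UTC, which is why a time-zone offset comes into play later when matching trackpoints to photos.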
When I return from a photo outing, I can either connect the GPS to my computer and transfer the GPX files from the GPS to the computer, or I can remove the memory card from the GPS and plug the card into my computer and access the GPX files that way. Ultimately, I put all of them into the same folder on my computer, a growing pile of GPS files that just sit there until I need them for geotagging. Here are two examples of GPS tracks. You will need Google Earth installed to view these. The first is a trip around Vancouver Island. The second is a bike ride around UCSD in La Jolla. Note no photos are shown with these tracks, that will come later.
|Vancouver Island track (view in Google Earth, Google Maps or Live Search Maps)||UCSD bike ride track (view in Google Earth, Google Maps or Live Search Maps)|
5. Geotagging: Merging GPS Data Into Your Digital Photos. You have finished the shoot and you carried your GPS with you the whole time, tracking your location while you took photos. You are back in your office and it’s time to do the geotagging! This is the point in your workflow where latitude and longitude metadata are transferred from your GPS into the EXIF fields within your digital photos.
Everyone shooting digital photos understands that photos must be “downloaded” to one’s computer, right? Well, the same goes for the GPS data: you must download the data files from your GPS unit as well. The actual geotagging, where the photos and GPS data are combined, can occur at two points in the process: either while the images are being downloaded (i.e., copied) from your camera to the computer, or after all the files have been downloaded and are sitting on the computer in separate folders.
I use the latter approach, and here is how I do it. I use a nifty little program named GPicSync, available free from Google. I make no claim as to its performance, but I have found that it works well for me.
- Download the GPX tracking files (i.e., files with extension .gpx) from the GPS unit’s memory card to a folder on my computer. The Garmin 60CSX happens to make one GPX tracking file per day with names like 20080811.gpx; other GPS units may be different in this regard. I place all the GPX files in a single folder; mine happens to be named “c:/gpx” but you can put them wherever you wish.
- Download the files from my camera’s memory card(s) to another folder on my computer. Let’s say the folder is named “c:/pics”. The files produced by my camera are raw files, but they could just as well be JPGs.
- Launch GPicSync. I first specify the folder where the GPX files are located. I just point it at the entire group of GPX files and it figures out which ones it needs. Then I specify the folder where the digital photos are located. Lastly, I specify the “UTC Offset”. This is the number of hours between the location where the images were shot and Greenwich Mean Time. This is needed because my camera’s internal time zone is local to me, but the GPS records time using Greenwich Mean Time. If the photos were taken in my neck of the woods (Pacific Time Zone) then the appropriate difference is -7 hours, so I enter -7 for the UTC offset. Then I press “start”.
- What happens? GPicSync looks at the time at which each photo was taken, compares that to all possible GPS tracking points that it finds in the GPX tracking files and finds the closest match. In other words, it determines which GPS point was recorded closest in time to when the photo was taken. The latitude, longitude and altitude are extracted from that GPS point and inserted into the appropriate EXIF fields in the digital photo. Nothing else is altered (hopefully!) and the digital photo is written back to the computer disk. In essence all that is changed is three EXIF fields in the digital photo; all the other information, including the image pixels themselves, is unchanged. At least this is how it is supposed to work, and so far I have encountered no problems.
- I am now free to continue on with my workflow and raw files, converting them into JPEGs and preparing them for display on the web and delivery to clients. They are now geotagged so (again, hopefully) the software that I use to manage my photo collection, make JPEG versions for the web and high res TIFFs for clients ensures that the location information in the EXIF metadata is passed along from one generation of the photo to the next.
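The nearest-in-time matching step described above is simple enough to sketch directly. This is my paraphrase of what GPicSync does, not its actual code: shift the camera’s local timestamp to UTC using the offset (Pacific Daylight Time is -7, as in the example), then pick the trackpoint whose UTC time is closest.

```python
from datetime import datetime, timedelta

def nearest_trackpoint(photo_local_time, utc_offset_hours, trackpoints):
    """Find the GPS trackpoint recorded closest in time to a photo.
    trackpoints: list of (utc_time, lat, lon, ele) tuples.
    utc_offset_hours: local minus UTC, e.g. -7 for Pacific Daylight Time,
    so UTC = local - offset."""
    photo_utc = photo_local_time - timedelta(hours=utc_offset_hours)
    return min(trackpoints, key=lambda p: abs(p[0] - photo_utc))
```

A real implementation would also refuse matches beyond some maximum gap (so a photo taken with the GPS switched off is left untagged rather than tagged with a stale point), then write the chosen lat/lon/ele into the photo’s EXIF fields.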
There are other software programs that can do this geotagging step. I chose GPicSync primarily because it supports tagging Canon raw files. In other words, it will go ahead and permanently alter the EXIF data (adding latitude, longitude, altitude) in my Canon raw files. This is a requirement for me, since I want as much metadata in the source image (my raw file) as possible. However, if you shoot JPEG then you have other options. Notably, I should mention a program named Downloader Pro, made by the developer of Breezebrowser Pro. I have been a longtime user of Breezebrowser and love it; it is perhaps the fastest and easiest image browsing program out there. When I set out to geotag my photos I planned to use Downloader Pro (a companion program to Breezebrowser Pro), but I soon found that it has a limitation that I cannot work around: it will not geotag Canon raw files (.cr2). I shoot exclusively raw files. I generate JPG and TIFF files from the raw files, but the “master” file is a raw file and the master file is the one I want geotagged. I emailed Chris Breeze (the maker of Breezebrowser and Downloader) about this and asked him why his program does not support geotagging Canon raw files. His reply was that by altering the EXIF header in the proprietary raw file one can damage the raw file, thereby making it unreadable by other software. (I understand that this is a risk, but since I can always save an unaltered copy of the raw file before it is tagged, I can work around that issue easily.) In any event, at the time of this writing Downloader Pro does not geotag Canon raw files, so I don’t use it. If and when that changes, I will immediately reconsider and probably start using Downloader Pro.
One further note on this step: consider backing up your original digital photos before proceeding with the geotagging. The geotagging that I describe alters the EXIF information in your digital photo. In the event that the geotagging software you are using has a flaw, it could corrupt the digital photo beyond repair. GPicSync, and probably most other geotagging programs, allows you to make a backup copy of each photo while the geotagging is being done, ensuring that you have a safe copy in case something bad happens to the altered copy. Needless to say, it’s probably a good idea to use this option if it is available, at least with shoots that are important.
6. Geo Data Flows To The Web. OK, you have geotagged your photos. Now what the heck do you do with them? Good question. I honestly don’t know all that is possible with the software that is out there. I use Expression Media to manage my collection of 22,000 images, most having 3-5 versions apiece, including keywording, cataloging, captioning, ranking, etc. Since Expression Media allows me to view the EXIF fields of my photos, I can check that the geotagging worked and that the correct latitude, longitude and altitude appear in the EXIF metadata. Great, but that is merely an exercise and does not really move me or my photos forward.
Importantly, I exploit the lat/long (latitude, longitude) data on my web site. Each photo in my collection has a corresponding record in a big database on my web server. The database has entries for location, species, keywords and a bunch of other database stuff. This database is created by Expression Media and then uploaded to the web server. This means that if Expression Media sees that an image has been geotagged, that lat/long information will flow from Expression Media to the web database. In other words, images that have been geotagged have lat/long entries in the database record, while images that have not been geotagged are missing lat/long entries in the database. When my website software (written by me using PHP and MySql) displays information about a geotagged image, it will notice the lat/long entry in the database and pass that information along to the display that the website visitor sees. For instance, take a look at the summary information for this image of the Wave in southern Utah:
You will see not only the coordinates of the location where this image was shot, but a few links related to the coordinates and some small blue ball icons as well. What are those? Read on for the most interesting part of this whole process.
7. Geotagged Images In Google Earth. Google Earth is an amazing world visualization product from Google. At present it is available in a free version and a commercial version. I have only used the free version. I am sometimes blown away by what can be done with it. For starters, it allows one to visually fly around the world and then zoom in close, seeing Earth features from a bird’s-eye view. That alone is pretty fun. But it gets better for photographers.
I should mention that if you do not have Google Earth installed on your computer, the discussion below will be merely academic. You won’t be able to check out the examples I mention without first installing Google Earth. Instead, click the “Google Maps” version of each link, but know that Google Maps is the lesser sibling to Google Earth when it comes to presenting geospatial stuff.
It is possible to generate Google Earth “overlays” that allow one to display almost anything in concert with Google Earth. These Google Earth overlays are similar in some ways to web pages that you view in a web browser but they are instead viewed in Google Earth, which is like a browser but for viewing the globe rather than text. For web visitors that have Google Earth installed on their computers, clicking on one of these Google Earth “overlay links” allows them to view things within Google Earth, usually in a meaningful spatial context. For instance, here are two overlays that together summarize the keepers we got in Tofino a few weeks ago. The first link presents the tracks, showing where we hiked (green), boated (purple) and flew (orange). The second link superimposes some photos above the sites where they were taken. Load both of these links in Google Earth:
Tofino tracks (view in Google Earth, Google Maps or Live Search Maps)
Tofino photos (view in Google Earth, Google Maps or Live Search Maps)
One of the most oft-mentioned examples of a “GeoBlog” — a blog that is customized for Google Earth — is that of noted primatologist Jane Goodall’s Gombe Chimp research group, which publishes a blog about their ongoing activities. The blog is “geo-enabled”, meaning that not only can it be viewed as a traditional web page but it can also be viewed in an enhanced form within Google Earth at the exact location where the research is being conducted in Africa. Look for the little blue ball icons on the blog, indicating Google Earth-enabled links.
I’ve done a similar thing with most of the major parts of my web site, including the blog and the individual images. For instance, most pages on my blog are now geo-enabled. Here’s an example of an individual post of mine that is geo-enabled. The first link below just shows the blog post, while the second link displays it in Google Earth at the proper location on Granville Island in Vancouver:
http://www.oceanlight.com/log/granville-island-public-market-vancouver.html (web page)
http://www.oceanlight.com/log/granville-island-public-market-vancouver.kml (view in Google Earth, Google Maps or Live Search Maps)
(Also, the entire blog is available as a “KML Feed”, meaning that it is a feed accessible by Google Earth. The KML 2.0 link for this is at the bottom right of the blog, under “Meta”.)
Each individual image of mine that is geotagged can be viewed in geospatial context in Google Earth, at the exact point on the globe where the photo was taken. The first of the two links below shows a detailed view of the photo on a boring web page, while the second link displays the image in Google Earth at the point in the Paria Vermillion Cliffs Wilderness where the Wave is situated and the photo was taken:
http://www.oceanlight.com/spotlight.php?img=20608 (web page)
http://www.oceanlight.com/20608.kml (view in Google Earth, Google Maps or Live Search Maps)
Lastly, you can view an entire collection of my images altogether in Google Earth. This was the hardest part for me to figure out, and required some geeky programming to get it right. But it’s now pretty powerful (at least I think so). In one fell swoop I can show you all of my images from, say, San Clemente Island, superimposed on the spots where they were taken. Or I can do this with all my blue whale images. Or those from the Galapagos, or Guadalupe Island. You get the idea. If you have Google Earth installed, check out these links and let me know what you think! Warning: each of these links displays dozens or hundreds of photos at once on Google Earth.
|San Clemente Island underwater photos on Google Earth (View in Google Earth, Google Maps or Live Search Maps)|
|Galapagos Islands photos on Google Earth (View in Google Earth, Google Maps or Live Search Maps)|
|Blue Whale photos on Google Earth (View in Google Earth, Google Maps or Live Search Maps)|
|Guadalupe Island photos on Google Earth (View in Google Earth, Google Maps or Live Search Maps)|
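Behind links like these is just generated KML, which is plain XML and straightforward to emit from database records. Here is a rough sketch of the idea in Python; the record layout and placemark contents are illustrative assumptions, not my site’s actual code. Note that KML coordinates go longitude first, then latitude.

```python
def image_placemark(image_id, caption, lat, lon):
    """One KML Placemark per geotagged image (illustrative fields)."""
    return f"""  <Placemark>
    <name>Image {image_id}</name>
    <description><![CDATA[{caption}]]></description>
    <Point><coordinates>{lon},{lat},0</coordinates></Point>
  </Placemark>"""

def collection_kml(records):
    """Wrap a list of (image_id, caption, lat, lon) records in a KML document."""
    placemarks = "\n".join(image_placemark(*r) for r in records)
    return f"""<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document>
{placemarks}
  </Document>
</kml>"""
```

A real version would also add thumbnail images and links back to each photo’s web page inside the description, which is what makes clicking a placemark in Google Earth so useful.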
Comments? Errors? Please let me know by email and I’ll try to amend this post. Thanks.
We returned from our British Columbia trip last weekend. We spent time in Whistler, Tofino, Victoria and Vancouver and managed to do a lot of biking, ziplining, hiking, flying, eating and beachcombing. During the past week I’ve reviewed all the photos we shot and edited down to a selection of about 200. You can view all of them on Google Earth (including the tracks recorded by our pocket GPS) if you have Google Earth installed on your computer, by clicking on the following link:
You will need to zoom in (say, to the Tofino region) before the individual images and tracks spread out enough to become distinct. I wrote some custom software to convert the GPS track files (*.gpx) into Google Earth code (*.kml); contact me if you would like a copy of the program (Windows only). If you just want to see the tracks, this link is better as it is not cluttered with the photos:
For the photos only, try this:
| Hikers admire the temperate rainforest along the Rainforest Trail in Pacific Rim NP, one of the best places along the Pacific Coast to experience an old-growth rain forest, complete with western hemlock, red cedar and amabilis fir trees. Moss gardens hang from tree crevices, forming a base for many ferns and conifer seedlings. Rainforest Trail, Pacific Rim National Park, British Columbia, Canada.
Location: Rainforest Trail, Pacific Rim National Park, British Columbia, Canada
View this Image in Google Earth!
Selected images from our collection of Galapagos Island photos can now be browsed in Google Earth. If you have Google Earth installed on your computer, you should be able to click on the link below and have our layer of images open up within Google Earth, showing where in the archipelago each image was taken. Zoom in to an island and the images will spread out, making it easier to select one. Clicking on an image will bring up a web page with more detail about it!
Photographs of the Galapagos Islands on Google Earth. If you do not have Google Earth installed, you can Download Google Earth to get started.
Many of our photographs of Guadalupe Island can be browsed in Google Earth through some new programming that has been added to OceanLight.com. If you have Google Earth installed on your computer, you should be able to click on the link below and have our layer of images open up within Google Earth, showing where around the island each image was taken. Zoom in and the images will spread out, making it easier to select one. Clicking on an image will bring up a web page with more detail about it!
Once we get further along with geotagging images, we can offer the same sort of displays for other places like Galapagos, Alaska, California, and Yellowstone. Currently about 15,000 of 22,000 images have been tagged.