Less is more: Signatures, cities and hash codes

In 1997, the Hubble Space Telescope’s Imaging Spectrograph (STIS) picked up the signature of a black hole. Instead of a vertical straight scan line, the STIS showed an S shape. Signatures point to the presence of something (a referent). But you see the thing only indirectly: you get to know that the referent exists, even though you can’t detect it simply by looking for it, or at it … even if you could.

From what I have read, the COVID-19 virus also produces a genomic signature, or perhaps several, depending on the means of detection and measurement. According to the OED, a signature is: “Any typical physical or behavioural characteristic, pattern, or response by which an object, substance, etc., may be identified.” As is the case with the S-shaped blip on the black hole spectrograph, a signature doesn’t bear a resemblance to the thing it signifies. Here’s a picture of the earth-bound Jodrell Bank radio telescope, used for picking up extraterrestrial signatures.

Social signatures

A signature takes up less space than its referent, and can be used as a stand-in during various calculations or logical operations, like moving name tags (signatures) around while organising the seating at a reception. That’s substantially easier than moving the people (referents) around.

I’ve been searching for examples of the use of “signature” in descriptions of large-scale social entities, like cities, societies or political movements. I came close with an article on “emotional signatures.”

“Political ideology thus has a discrete emotional signature, one favoring anxiety among conservatives and anger among liberals” (98).

In the context of that particular study, anxiety was a means of identifying the presence of a conservative mindset. The emotion of anger indicated a liberal mindset. An emotion, or at least an emotion word, served as a signature of a group.

Urban signatures

There are plenty of businesses, buildings and cities claiming to be signatures, landmarks, showcases and flagships. Digital tech contributes to the concept of an urban signature. For example, some scholars have inferred the flow of people around a city from the traces left by images uploaded to photo-sharing websites.

Researchers used that as an index of the city’s cultural and social structure. It’s a bit like detecting the rapid flow of material around a black hole to determine its presence and size, even though you can’t see the black hole. Terms such as footprint, fingerprint and profile also come to mind as substitutes for signature in an urban context.

Data signatures

I’m examining cryptography as an urban phenomenon. That foregrounds the concept of hash coding, which is important in search, indexing, data access, authentication, security, encryption and cryptocurrencies. Some scholars write about the hash code of a file or document as its signature. It’s an algorithmically generated, much-abbreviated form of the document that serves as its unique identifier: its signature.

Here I want to show how a hash code belongs to a family of data translation operations relevant to the way graphical, 2D and 3D objects and elements are represented in databases. That might be relevant to signatures of the city, or at least of a city’s inert visual elements.

Hash code

A hash code is a short digest of a file, a kind of extreme condensation. The hash is a string of characters of uniform length, with the length set by the hashing algorithm used. It is derived from the original (referent) data, whatever the original size and format of that data. It bears no resemblance to the original data for either a human reader or machine. See posts about hash coding.

Assuming a robust hashing algorithm, the hash string cannot feasibly be reverse engineered to recover the original data, even if you or your computer has access to the algorithm that produced the hash. The hash string is, for practical purposes, a unique identifier (signature) of its referent: two different files could in principle produce the same hash, but with a robust algorithm that is vanishingly unlikely. It is therefore extremely useful in database management, search, listing, ordering and processing data. Most importantly, if you run the same referent data through the same hash algorithm it will produce exactly the same hash string. So hash coding is used for verifying that a file has not been corrupted or tampered with in transit.
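As a minimal sketch of those properties, here is a short Python example using SHA-256 from the standard hashlib module. The choice of SHA-256, the function name hash_code and the sample texts are my assumptions for illustration; any robust hashing algorithm would serve.

```python
import hashlib


def hash_code(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


short_text = b"Less is more"
long_text = b"Less is more: signatures, cities and hash codes. " * 1000

# The digest has the same length whatever the size of the input.
print(len(hash_code(short_text)), len(hash_code(long_text)))  # 64 64

# The same input always yields the same digest ...
assert hash_code(short_text) == hash_code(b"Less is more")

# ... while a one-character change yields a different one.
tampered = b"Less is mere"
print(hash_code(short_text) == hash_code(tampered))  # False
```

Comparing a freshly computed digest with a published one is how a recipient checks that a file arrived unaltered.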

I’ve yet to discover a direct spatial analogue to hash coding, at least in a way that’s relevant to the spatial arrangement of cities. That said, the family of translation operations to which hashing belongs is relevant, at least in the way spatial urban data is represented graphically, e.g. aerial imagery, street patterns, property boundaries.

Data compression

A hash code can be thought of as an extreme form of data compression. That’s the condensation of something large or complicated down to something small and more portable that retains the ability to identify its referent. Image compression provides such a function, and downsampling is one example. In that case a full-colour, high-definition image gets reduced to something smaller. The usual aim of compression is to create a small version of an image file that stays as visually close to the original as possible.

Lossless data compression

Lossless data compression produces a smaller file, but no information is lost. In the case of an image file, if the sky is a uniform blue colour then the lossless compression algorithm counts the number of pixels of the same colour along a scan line in the image and stores information about the range of pixels with that colour. That’s the run-length encoding method. It’s more efficient than storing colour information about every single pixel.
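Run-length encoding is simple enough to sketch in a few lines of Python. The function names and the sample scan line below are hypothetical, and real image formats pack the runs far more compactly, but the principle is the same.

```python
from itertools import groupby


def rle_encode(scan_line):
    """Collapse runs of identical pixels into [value, run_length] pairs."""
    return [[value, len(list(run))] for value, run in groupby(scan_line)]


def rle_decode(pairs):
    """Expand [value, run_length] pairs back into the full scan line."""
    return [value for value, count in pairs for _ in range(count)]


# One hypothetical scan line: 12 sky pixels, 3 cloud pixels, 5 more sky pixels.
scan_line = ["blue"] * 12 + ["white"] * 3 + ["blue"] * 5

encoded = rle_encode(scan_line)
print(encoded)  # [['blue', 12], ['white', 3], ['blue', 5]]

# Lossless: decoding restores every pixel exactly.
assert rle_decode(encoded) == scan_line
```

Twenty pixel values shrink to three pairs, and nothing is lost in the round trip.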

“Lossy” compression

“Lossy” compression, such as JPEG image compression, adopts more cunning methods, such as discarding fine colour detail where a lot of adjacent pixels contrast strongly. The eye is less sensitive to colour variation across a busy part of the picture than across large areas of subtle colour variation. What is important here is that the method strips away a substantial part of the image information, thereby reducing the file size. However, once that information is lost, there’s no way to bring it back from the compressed file.
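To illustrate why lossy methods can’t be undone, here is a much-simplified stand-in in Python. It is not the JPEG algorithm, which works on frequency components of 8×8 pixel blocks. It just snaps colour values to a coarser set of levels; the function name quantise and the sample values are hypothetical.

```python
def quantise(value, levels=4):
    """Snap a 0-255 channel value to the midpoint of one of `levels` equal bands."""
    band = 256 // levels
    return (value // band) * band + band // 2


# Six hypothetical pixel values from one colour channel.
pixels = [200, 203, 210, 198, 100, 110]

quantised = [quantise(p) for p in pixels]
print(quantised)  # [224, 224, 224, 224, 96, 96]
```

Four distinct values have collapsed into 224 and two into 96, so nothing in the reduced data says which original value each pixel once had.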

Scaling down the size of an image involves a simpler method than JPEG compression. It simply averages the colour value of every square group of pixels (e.g. 8×8) to a single pixel. Information gets lost in the averaging process. It’s not possible to reconstitute the original information from a set of averages.
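The averaging can be sketched as follows, in plain Python. I’ve assumed greyscale values and 2×2 blocks rather than 8×8 to keep the sample images small; the function name downsample and both images are made up for illustration.

```python
def downsample(image, block=2):
    """Average each block x block group of pixels into a single pixel.

    `image` is a list of rows of greyscale values; its width and height
    are assumed to be exact multiples of `block`.
    """
    height, width = len(image), len(image[0])
    small = []
    for top in range(0, height, block):
        row = []
        for left in range(0, width, block):
            group = [
                image[y][x]
                for y in range(top, top + block)
                for x in range(left, left + block)
            ]
            row.append(sum(group) // len(group))
        small.append(row)
    return small


image_a = [
    [10, 30, 200, 200],
    [30, 10, 200, 200],
    [0, 0, 90, 110],
    [0, 0, 110, 90],
]
# A different image whose blocks happen to share the same averages.
image_b = [
    [20, 20, 200, 200],
    [20, 20, 200, 200],
    [0, 0, 100, 100],
    [0, 0, 100, 100],
]

print(downsample(image_a))  # [[20, 200], [0, 100]]
print(downsample(image_a) == downsample(image_b))  # True
```

Two different images reduce to the same four averages, which is why the original can’t be reconstituted from the reduced version.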

I think of such compression as meeting some of the functionality of a hash code, thereby serving as a rudimentary signature of the uncompressed image.

Encryption

This is the translation of data into an unreadable format. An encrypted file is about the same size as the original, its referent. It’s no more portable than the original, and is only of use to someone with the decryption key, by which it can be restored. Encryption and hash coding are related, but an encrypted file doesn’t deliver the functionality of a signature. See posts on encryption.
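As a toy illustration of those two properties (same size, and recoverable only with the key), here is a one-time pad in Python: each byte of the message is combined with a byte of a random key of equal length. This is my choice of example, not how files are encrypted in practice, where a vetted library and a standard cipher would be used; the function name xor_bytes and the message are hypothetical.

```python
import secrets


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR each byte of data with the matching byte of an equal-length key."""
    if len(key) != len(data):
        raise ValueError("key must be the same length as the data")
    return bytes(d ^ k for d, k in zip(data, key))


message = b"Meet at the reception at noon"
key = secrets.token_bytes(len(message))  # random key, to be used only once

ciphertext = xor_bytes(message, key)

# The encrypted data is no smaller than the original ...
assert len(ciphertext) == len(message)

# ... and applying the same key again restores the referent exactly.
assert xor_bytes(ciphertext, key) == message
```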

Data translation

That’s simply the translation of data from one format to another, such as converting DWG to OBJ for 3D files, or PNG to GIF for image files. These formats are legible to a wide range of software applications and editors.

Here’s a table summarising these different aspects of data conversion methods.

Method                 | Reduced size | Needs a key | Unique to referent (potentially)      | Referent is recoverable | Used for security
Data translation       | sometimes    | no          | yes                                   | mostly                  | no
Encryption             | no           | yes         | yes                                   | yes                     | yes
Compression (lossless) | yes          | no          | yes                                   | yes                     | no
Compression (lossy)    | yes          | no          | yes (except for extreme downsampling) | no                      | no
Hash code              | yes          | n/a         | yes                                   | no                      | yes

I’m part way to showing how signatures and hash coding permeate the city (for a later post).

Semiotics of signatures

As I have been studying semiotics and the philosophy of C.S. Peirce, I am bound to note that a signature (and a hash code) is a rhematic-indexical-legisign. People don’t look like their handwritten signatures. Anything can be said to resemble any other thing in some way (shape, colour, sound, smell, size, orientation, function, place, etc). That’s in the nature of metaphor and figurative speech. But the use of a signature doesn’t depend on establishing a resemblance. That would be an icon, e.g. a small picture or thumbnail of yourself in a blog post or tweet.

A signature is an index in that it exhibits some kind of causal connection with its referent. The person to whom the signature refers makes the signature. By a complicated chain of causes a black hole makes a signature on a spectrograph. The recognition and use of a signature depends on a set of conventions (legisign). To be useful it has to refer to something else (a referent), a bit like a pronoun in a sentence. It is functionally incomplete without its referent (it’s a rheme). See post: Sign here.

Extreme compression

If graphical image compression continues, it produces something with about as much information as a thumbnail image. That would be an icon, not an index. Even more compression produces just a blurry blob of colours, perhaps just a few pixels. That could provide just enough information to retain unique identification with the original, its referent. In that case the compressed image looks more like a signature and can function as a hash code, though not as securely.

References

  • Bower, Gary, and Richard Green. 1997. STIS records a black hole’s signature. Hubblesite, 12 May. Available online: https://hubblesite.org/contents/media/images/1997/12/477-Image.html?news=true (accessed 9 June 2019).
  • Furno, Angelo, Marco Fiore, Razvan Stanica, Cezary Ziemlicki, and Zbigniew Smoreda. 2017. A Tale of Ten Cities: Characterizing Signatures of Mobile Traffic in Urban Areas. IEEE Transactions on Mobile Computing, (16) 10, 2682–2696.
  • Girardin, F., J. Blat, F. Calabrese, F. Dal Fiore, and C. Ratti. 2008. Digital Footprinting: Uncovering Tourists with User-Generated Content. IEEE Pervasive Computing, (7) 4, 36–43.
  • Robinson, Michael D., Ryan L. Boyd, and Adam K. Fetterman. 2014. An emotional signature of political ideology: Evidence from two linguistic content-coding studies. Personality and Individual Differences, (71), 98–102.

7 Comments

    1. That’s very helpful Jon. Nice brevity.
      (OED) Paraph: “A flourish made after a signature, originally as a precaution against forgery”
      Relates to “paragraph.” https://www.etymonline.com/search?q=paragraph
      “from Medieval Latin paragraphus ‘sign indicating the start of a new section of a discourse’ (the sign looks something like a stylized letter -P- and a version of it still is used in copy-editing), from Greek paragraphos ‘short stroke below the beginning of a line marking a break in sense,’ also ‘a passage so marked,’ literally ‘anything written beside,’ from paragraphein ‘write by the side,’ from para- ‘beside’.”
      It’s a bit like Jeremy Bearimy: https://www.youtube.com/watch?v=RFm9ClqlGuo.

      1. Jon Awbrey says:

        Also see Panache.

        (But don’t get me started on Pilcrow.)

  1. Your reference to run-length encoding reminded me of its extension to scan-line coherence, the idea that the next scan line is likely to be very similar to the current one, so you only need to identify the differences. This can be extended to frame coherence, where in most instances the next whole frame is unlikely to differ greatly from the current frame.

    1. Indeed. Storing differences can also be lossless.
