-
Notifications
You must be signed in to change notification settings - Fork 0
/
README
150 lines (120 loc) · 7.44 KB
/
README
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
Welcome to Cardinal Sin
The goal of this project is to make it easy to find the photos you're looking
for, regardless of the scale of the search. For instance, it should help you
find the 5 best photos from a shoot of 1500, and it should also help you find
the long lost photos you took of the professor who just passed away, photos that
may need to run with a news story that has a deadline only hours away.
Further, this is a thought project as much as it is a software project. The
fundamental concept of this project is that the only things you care about are
sets. Amorpheous sets of photos. Anything else that you think you care about
is only a proxy for one or more of these (possibly overlapping) sets.
So, for instance:
- You don't care about numerical ratings,
- You care about whether a photo is better, worse, or about the same as
another photo.
- You care about whether a photo is suitable for a certain purpose — if your
purpose is to demonstrate that not every photo works out, then your best
photos aren't the ones you're interested in.
- You care about how well certain groups of photos fit together.
- You care about which groups of photos are better than which other groups.
- You don't care about labels or keywords or search strings,
- You care about sets of photos that are related to concepts that you have in
mind. Albeit, the way that these concepts are most frequently communicated
is by use of keywords.
My original brainstorm for the project started off as an email, which I include
below.
Note: this is as much a brainstorming email as it is a solicitation
for ideas, so it's long and a bit rambly and probably convoluted too.
That said, I think there are some interesting and perhaps useful ideas
here.
(Also, as I've continued writing this, it occurs to me that I'm really
building a theoretical system for organizing photos, and then trying
to think of ways to implement it.)
Any thoughts or ideas or feedback would be appreciated :o)
So, I've been hacking on geeqie[1] for the past couple days, and now I'm
thinking of cool things to do that would improve my ability to find images. And
really, all I ever want to do is find images. For instance, the only reason I
rate images is so I can find the good ones, or the mediocre ones, or the bad
ones.
One disconnect I've found between the way I think and the way that geeqie
behaves can be described like this: I like to think about sets of photos. The
sets only exist insofar as the elements share some common property.
### THEORY ###
Let's say this: A photo is defined as a singleton set containing a photo. A set
contains zero or more sets. Furthermore, a set only exists if I am thinking
about it.
So if I'm not thinking about a set, it doesn't need to exist, and thus stops
existing. If I start thinking about a set, it pops into existence. The important
thing here isn't really the existence of a set, but rather the property that is
shared by the elements of the set. Sets whose elements have useful shared
properties will tend to be more useful, and so they'll tend to exist. Useless
sets (*cough* the set of sets that don't contain themselves) aren't actually
useful, so people won't think about them.
So a "shoot" can be considered a set of photos. A "burst" of photos is a set —
the photos are similar and/or were taken in rapid succession. Of course, there
are different reasons why a "burst" of photos might be considered a set, and as
such, there might be a number of overlapping sets related to that burst. For
one, a burst is a set simply because the photos were taken in rapid succession.
They might also be a set because they were taken with the same intention.
Important subsets might be "images with interesting facial expressions," or
"images that form a semantic sequence — an animation".
Similarly, keywords or other properties define other sets. "Photos taken near
the Googleplex," "photos of Carlos Santana," "photos that are mostly dark,"
"photos that are much wider than they are tall — panoramas".
There exist partial orders on most sets:
- Not all photos are comparable — it can be difficult to tell whether a portrait
or a landscape is "better" on any sort of absolute scale.
- However, in many useful sets, most of the photos will be somewhat comparable.
In some sets, the partial order may be a total order (for instance, "which of
the shots from this burst is the sharpest?").
- Because partial orders exist across sets, this means that partial orders may
exist across sets of photos. This is useful when thinking about questions like
"which images will go in my next exhibition?" "Which images should I submit to
the newspaper for this photo essay?" "Which images should I send to the photo
forum for critique?" The ranking scheme in such cases likely includes not only
how good the individual photos are, but also (respectively) how well the
photos work together; how representative they are, as a group, of the thing
that was documented; or how representative they are of all the photos taken
during a shoot.
So, at this point, we've got a very hand-wavy system that allows us to think
intuitively about photos:
1) There exist photos
2) There exist natural groups of photos (it's not important yet that the natural
groups can't really be well-defined or enumerated)
3) The natural groups of photos overlap in all sorts of different ways.
Different natural groups may be useful for different reasons.
4) There may exist ways to compare some of these natural groups (this statement
is the ultimate in hand-waviness ;o)
### IMPLEMENTATION ###
This system, as described, cannot be implemented. So the goal isn't to implement
the system as described, but to implement something similar that provides most
of the benefits.
Here are some challenges:
- The sets are nebulous — they should only exist insofar as we're interested in
something. Sets that we're not interested in, that aren't useful, should not
exist.
- The implementation and the user need some way to communicate about and to
refer to these sets. Even if the implementation knows about a particular set,
that's not useful unless the user can communicate a reference to that set and
have the implementation find/show it.
Here are some ideas:
- Some of these sets can be inferred: similar images from a burst will (in most
cases) be consecutive (or nearly-so), and will have been taken in a
constrained period of time.
- "shoots" can often be inferred from the data storage scheme (directories on a
filesystem), or from other info (all of the photos were imported at the same
time, for example). Even if these inferences are wrong, it shouldn't be too
difficult to fix them manually if you can split or join "shoot" sets.
- Join sets by making them subsets of a common superset — the signals that led
to the inference may be useful for other things, even if they weren't directly
useful for inferring the "shoot" set
- In many cases, it's probably easy enough to make sets exist or not exist by
doing tricks with data structures (in C, pointers come to mind)
- If the user selects multiple photos for any reason, that selection should
become a set. If it turns out that the set isn't useful, the implementation
should be able to infer that. It's better to err on the side of too-many sets,
as long as there are good ways to deal with too-many sets, such as some sort
of LRU or LFU[2] scheme.
--xsdg
[1] http://geeqie.sourceforge.net/
[2] http://en.wikipedia.org/wiki/Cache_algorithms#Least_Recently_Used