Avoiding Duplicate and Alternative Sets

There are some set records within the WACS system that you probably only want to show under a very limited range of circumstances. These are sets that are marked as Secondary or Continuation sets. Sets marked in this way offer nothing different or significant from the viewpoint of the end user but are useful to us as developers and site managers. Examples of these are different resolution versions of an existing set or a second part of a video clip that has been split into multiple smaller clips. For instance you might want to offer a reduced resolution image set for web site users to download to their mobile phones, or a choice of resolutions of a video clip.


The concept of Secondary and Continuation sets was introduced in WACS 0.9.0 - prior to that such sets were marked with a set type of Duplicate which proved cumbersome and difficult to make use of. Duplicate records were typically hidden using the preference exclusions mechanism we'll describe in the next section.

Understanding Link Relations

This mechanism is implemented through the srank database field, which is currently defined to have three possible values, or no value at all. Normal sets that should appear in listings are described as primary sets and will have an srank of P, indicating a primary record. Where a record has no srank value, it should be assumed to be a Primary record, for backwards compatibility with earlier WACS records.

In the case where a record is an alternative version of a set that already exists, it should be given the srank of S, indicating it is a secondary record. In addition to this, the sduplicates field for this record should contain the set number of its primary version, and the sduplicates field of the primary version should point to this secondary set. Where there are three or more variants of the same thing, the links should form a circular chain, each taking you to the next such set and, at the final duplicate, back to the primary set. The set administration tools do not currently support setting up a chain of three or more links, but the code shouldn't be broken by one existing within the database.
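The circular sduplicates chain described above can be walked with a simple loop. What follows is a minimal sketch in Python rather than WACS code: the dictionary layout, the function name duplicate_chain and the set numbers 200 to 202 are all assumptions made up for illustration.

```python
def duplicate_chain(sets, start):
    """Follow sduplicates links from `start` until the chain closes.

    `sets` is a hypothetical in-memory stand-in for the sets table:
    a dict mapping set number -> dict of field values.
    """
    chain = [start]
    seen = {start}
    current = sets[start].get("sduplicates")
    # Stop when the link is empty or when we arrive back at a set we
    # have already visited (the chain is expected to be circular).
    while current is not None and current not in seen:
        chain.append(current)
        seen.add(current)
        current = sets.get(current, {}).get("sduplicates")
    return chain


# Hypothetical three-way chain: 200 is Primary, 201 and 202 are Secondary.
sets = {
    200: {"srank": "P", "sduplicates": 201},
    201: {"srank": "S", "sduplicates": 202},
    202: {"srank": "S", "sduplicates": 200},
}
print(duplicate_chain(sets, 200))  # [200, 201, 202]
```

The seen set is there so that a badly formed chain cannot cause an endless loop; it also means the same function copes with the ordinary two-way case, where the Primary and Secondary simply point at each other.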

The final of the three cases is that of a continuation record. These will be given the srank of C, indicating a continuation record. This srank will only be set on the second and subsequent set records of this conceptual chain - the first set in a chain with continuations will be either a Primary or Secondary set. In addition to the set being of the Continuation type, it will have a number of other fields set to help in navigation. The first of these is that the second such set, i.e. the first continuation, will have the sprevious field set to the set number of the first set in the sequence. If there is a second continuation set (i.e. a third part of the whole set), the first continuation's sprevious will still be set to the number of the first set, and its snext will be set to the number of the third (second continuation) set.

Illustrating How Link Relations Work

Since this is a fairly complex concept, here's a diagram to try and help you understand what is going on here:

In the diagram we're dealing with four separate sets that effectively contain exactly the same scenario with the same model in the same location. The first of these in the diagram is set 123, which is a straightforward image set - nothing special about it except that it does have a corresponding video clip. It uses only one of the relationship links, to link to the best quality (and therefore chosen to be the Primary) version of the video clip, which is set 124. This is the saltmedia link, as Video is an alternative media to a still image set.

Moving on to the video clips, note that all three of them say that their alternative media is the single, only version of the image set. Therefore the saltmedia on each of the videos mentions the image set, namely set 123, as their direct alternative. This is fine - it doesn't have to be a symmetric relationship, just true! These links are shown by the red line on the diagram.

We have three video clips and this is obviously the most complex part of the diagram and the relationships we're trying to explain. For the sake of argument we're going to say that the first video clip, set 124 is a High Definition MPEG-4 movie file weighing in at a massive 1920 x 1280 pixels and 700MBytes. It's HUGE. The other two video clips are Standard Definition WMV movie files containing the same movie at DVD resolution of 720 x 480 pixels and weighing in at 90MBytes and 82MBytes respectively. These sizes are far more appropriate for people using media players on their mobile phones, tablet computers or just simply older PCs without the power to play High Definition video properly. The movie has been broken into two approximately equal size video clips to make downloading them easier when on the move or with a limited bandwidth connection.

Starting off with the big High Definition movie clip, we can see that this is the Primary version of this set and therefore the one we want to appear in searches, new release highlights and on the simpler model pages. The other versions are no different in what they contain in terms of subject matter, and the choice between them can be made once the set itself has been selected. We also don't want all three versions appearing as separate entries in any selection the user makes.

The first of the Standard Definition video clips contains the start of our movie and so that's always going to be the one that anything else refers to when looking for the smaller version. It's therefore classed as the Secondary version and it has links back to the primary version by way of the sduplicates link because it contains the very same footage as the Primary version does, just in a reduced size format. These are the green links on the diagram.


Note that the second part of the video, namely set 126, also links back to set 124 as its primary, since the longer complete High Definition clip contains the same scenes as the second half of the Standard Definition clip does. Of course, if the High Definition clip was also split into two parts, the second half of the Standard Definition clip would link via sduplicates to the second half of the High Definition clip.

The final group of links on the diagram, those in blue, concern linking together the two sequential halves of the Standard Definition clip. Therefore set 125's snext field says "my movie continues in set 126" and set 126 uses sprevious to point back to set 125 as preceding it. In addition, the ssetpos field is set to 1, 2, 3, etc. to indicate a given clip's place within the overall movie.
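To make the walkthrough concrete, the four sets can be written out as rows in a throwaway table. This is only a sketch under stated assumptions: the column types, the stype codes 'I' and 'V', the srank of P on the image set and the cut-down column list are guesses for illustration, not the real WACS schema. The link values themselves are taken from the description above.

```python
import sqlite3

# In-memory scratch database; the schema is a simplified assumption.
conn = sqlite3.connect(":memory:")
conn.execute("""create table sets (
    setno integer primary key, stype text, srank text,
    saltmedia integer, sduplicates integer,
    sprevious integer, snext integer, ssetpos integer)""")

rows = [
    # setno stype srank saltmedia sduplicates sprevious snext ssetpos
    (123, 'I', 'P', 124, None, None, None, None),  # image set
    (124, 'V', 'P', 123, 125,  None, None, None),  # HD video, Primary
    (125, 'V', 'S', 123, 124,  None, 126,  1),     # SD part one, Secondary
    (126, 'V', 'C', 123, 124,  125,  None, 2),     # SD part two, Continuation
]
conn.executemany("insert into sets values (?,?,?,?,?,?,?,?)", rows)

# Red links: every set naming the image set as its alternative media.
alt_of_image = [r[0] for r in conn.execute(
    "select setno from sets where saltmedia = 123 order by setno")]

# Green links: every set that duplicates the HD Primary.
duplicates_of_hd = [r[0] for r in conn.execute(
    "select setno from sets where sduplicates = 124 order by setno")]

# Blue links: the parts of the SD movie in playing order.
sd_parts = [r[0] for r in conn.execute(
    "select setno from sets where ssetpos is not null order by ssetpos")]

print(alt_of_image)      # [124, 125, 126]
print(duplicates_of_hd)  # [125, 126]
print(sd_parts)          # [125, 126]
```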


It's important to note that the sprevious and snext link chains are NOT circular. The sprevious of the first (Primary or Secondary) set will be null, as will the snext of the final set in the chain.
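Because the sprevious and snext chains end in nulls rather than looping, collecting all the parts of a split clip is a matter of rewinding to the start and then walking forward. The sketch below is illustrative Python, not part of WACS; the function name clip_sequence and the dictionary layout are assumptions.

```python
def clip_sequence(sets, setno):
    """Return every part of a split clip, in order, given any one part.

    `sets` is a hypothetical dict mapping set number -> field values,
    standing in for rows fetched from the sets table.
    """
    visited = {setno}
    # Rewind: follow sprevious until it is null (the first part).
    while sets[setno].get("sprevious") is not None:
        setno = sets[setno]["sprevious"]
        if setno in visited:
            raise ValueError("sprevious chain loops; it should end in null")
        visited.add(setno)

    parts = [setno]
    # Walk forward: follow snext until it is null (the final part).
    while sets[setno].get("snext") is not None:
        setno = sets[setno]["snext"]
        if setno in parts:
            raise ValueError("snext chain loops; it should end in null")
        parts.append(setno)
    return parts


# The two Standard Definition halves from the diagram.
sets = {
    125: {"srank": "S", "sprevious": None, "snext": 126, "ssetpos": 1},
    126: {"srank": "C", "sprevious": 125, "snext": None, "ssetpos": 2},
}
print(clip_sequence(sets, 126))  # [125, 126]
```

The loop checks are defensive only: a well-formed chain never triggers them, but a damaged record should produce an error rather than a hang.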

While it's not really what we're discussing here, there are various utility functions in the WacsStd perl module to aid and abet in maintaining these links if you are writing your own collection maintenance tools. Take a look at linkfromprevious and linkrelated for more information on these.

Coding For Link Relations

Hopefully after the last few sections, you now understand why the Link Relations mechanism was added and how it should work. Obviously we now need to feed this back into how we write SQL code to retrieve sets. There are two main things we're going to want to do - the first is to tailor our main retrieval pages to ignore these various alternative versions, and the second is on some occasions to detect where there might be additional icons and links we need to add to our set display code.

First, we need to exclude Secondary and Continuation records from our normal set index selections. This can be done with this SQL code segment:

( srank not in ('C','S') or srank is null )

As an example, if you want to select Lesbian sets in a Countryside location, you'd create a query something like:

select setno, stitle, stype, srating from sets
where slocation = 'Country' and scatflag = 'L'
  and ( srank not in ('C','S') or srank is null );
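The "or srank is null" half of the clause matters more than it might appear. In SQL, comparing a null against a list with "not in" yields neither true nor false, so older records with no srank would silently disappear from results without it. The sketch below shows this using Python's sqlite3 module and a scratch table; the column types and sample rows are invented for illustration, and other database engines should behave the same way for this comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, assumed schema holding just the columns the query uses.
conn.execute("""create table sets (
    setno integer primary key, stitle text, stype text, srating integer,
    slocation text, scatflag text, srank text)""")
conn.executemany("insert into sets values (?,?,?,?,?,?,?)", [
    (1, "Older record",    "I", 3, "Country", "L", None),  # pre-srank set
    (2, "Full movie",      "V", 4, "Country", "L", "P"),
    (3, "Small version",   "V", 4, "Country", "L", "S"),
    (4, "Small version 2", "V", 4, "Country", "L", "C"),
])

base = ("select setno from sets "
        "where slocation = 'Country' and scatflag = 'L' and ")

# The clause as given in the text: legacy rows with no srank are kept.
with_null_check = [r[0] for r in conn.execute(
    base + "( srank not in ('C','S') or srank is null ) order by setno")]

# Without the null test, set 1 is dropped along with sets 3 and 4.
without_null_check = [r[0] for r in conn.execute(
    base + "srank not in ('C','S') order by setno")]

print(with_null_check)     # [1, 2]
print(without_null_check)  # [2]
```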

At this point we've only just started using the srank field, and the three values of Primary, Secondary and Continuation seem adequate. It is possible that we might add additional values as we find a need for them - you might therefore wish to define a variable at the top of your files, $rankfilt, to set which ones you're trying to filter. At present this would be $rankfilt="'C','S'" and would be used within the brackets of the in () clause in the SQL.
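The same idea carries over to whatever language your pages are written in. As a rough sketch in Python (the names RANKFILT and visible_sets_clause are invented here and are not part of WACS), the list of hidden ranks lives in one place and the clause is built from it:

```python
# Ranks to hide from normal listings; change this one line if new srank
# values are introduced later. This is a fixed constant written by the
# developer, never text supplied by a site visitor.
RANKFILT = "'C','S'"


def visible_sets_clause(rankfilt=RANKFILT):
    """Build the SQL fragment that keeps Primary and unranked sets."""
    return f"( srank not in ({rankfilt}) or srank is null )"


query = (
    "select setno, stitle, stype, srating from sets "
    "where slocation = 'Country' and scatflag = 'L' "
    "and " + visible_sets_clause()
)
print(query)
```

Building SQL by pasting strings together is only reasonable here because the rank list is a constant under your control; values that come from a user should go through your database layer's parameter binding instead.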