# SamplingStrata 1.5-4

-	Minor changes due to the introduction of R 4.3.0


# SamplingStrata 1.5-3

## Major changes

-	As in R 4.2.0 a matrix objects now also inherit from class "array", this 
	invalidates code incorrectly assuming that class(matrix_obj)) has length one.
	So, changes have been done consequently in SamplingStrata functions where
	tests on class of an object were performed, otherwise causing errors.

# SamplingStrata 1.5-2

## Major changes

-   A new function 'selectSampleSpatial' has been added: if geographical
    coordinates are available in the frame, in order to obtain spatially
    distributed selected points this function makes use of the
    'lpm2\_kdtree' function from the SamplingBigData package
    (Tillé-Grafstrom).

-   A new function 'bethelProc' has been added: this function allows to
    execute a complete procedure from the Bethel optimal allocation to
    the selection of a sample, without having to optimize the strata,
    that are supposed to be given and fixed.

# SamplingStrata 1.5-1

## Major changes

-   As starting from R 4.0.0 ‘stringsAsFactors = FALSE’ becomes a
    default, in all calls to data.frame the parameter ‘stringAsFactors =
    TRUE’ has been indicated, in order to ensure the same results

-   Fixed some bugs related to handling take-all strata for ‘atomic’ and
    ‘spatial’ methods



# SamplingStrata 1.5

## Major changes

  - A new ‘optimStrata’ function is available: this function is a
    wrapper that allows to execute the three different optimization
    functions: (i) optimizeStrata (method = “atomic”); (ii)
    optimizeStrata (method = “continuous”); (iii) optimizeStrata (method
    = “spatial”).

  - A new ‘optimizeStrataSpatial’ function is available: this function
    optimizes the frame stratification taking into account also spatial
    correlation of frame units in a territorial context. As for
    ‘optimizeStrata2’, this function can be used only on continuous
    stratification variables.

  - A new ‘kMeansSolutionSpatial’ function has been added: this function
    is the same than ‘KmeansSolution’, but operates in the case of
    optimization with only continuous stratification variables and has
    to be used only in conjunction with ‘optimizeStrataSpatial’.

  - A new ‘KmeansSolution2’ function has been added: this function is
    the same than ‘KmeansSolution’, but operates in the case of
    optimization with only continuous stratification variables and has
    to be used only in conjunction with ‘optimizeStrata2’.

  - A new ‘prepareSuggestion’ function has been added: this function
    operates on the result of ‘kMeanSolution2’ or ‘kMeanSolutionSpatial’
    order to prepare the suggestions for the optimization with only
    continuous stratification variables.

  - A new ‘computeGamma’ function has been added: this function allows
    to calculate a heteroscedasticity index, to be passed to
    optimization step as a ‘model’ parameter in order to correctly
    evaluate the variance in the strata.

# SamplingStrata 1.4-1

## Major changes

  - A new ‘summaryStrata’ function has been added, enabling to get
    structured information regarding the strata produced by the
    ‘optimizeStrata2’ function (operating on only continuous
    stratification variables).

  - A new ‘assignStrataLabel’ function has been added, enabling to
    assign the optimized strata label to new units to be added in the
    sampling frame.

  - Fixed a bug in the optimization step for continuous variables

# SamplingStrata 1.4

## Major changes

  - A new function ‘optimizeStrata2’ is available. This function
    performs the same task than ‘optimizeStrata’, but with a different
    Genetic Algorithm, operating on real values genome, instead of an
    integer one. This pemits to operate directly on the boundaries of
    the strata, instead of aggregating the initial atomic strata. In
    some situations (limited size of sampling frame) this new function
    is much more efficient. The limitation is in the nature of the
    stratification variables, that are required to be all continuous
    (though categorical ordinal could be handled).

  - A new function ‘expected\_CV’ has been added to calculate CV’s on
    target variables in different domains that may be expected from a
    given solution, output of the ‘optimizeStrata’ execution.

  - Fixed a bug in the optimization step when considering also take-all
    strata

# SamplingStrata 1.3

## New functions

  - The optimization of population frame is run in parallel if different
    domains are considered. To this end, the parameter ‘parallel’ can be
    set to TRUE in the ‘optimizeStrata’ function. If not specified, n-1
    of total available cores are used OR if number of domains \< (n-1)
    cores, then number of cores equal to number of domains are used.

  - A new function ‘KmeansSolution’ produces an initial solution using
    the kmeans algorithm by clustering atomic strata considering the
    values of the means of target variables in them. Also, if the
    parameter ‘nstrata’ is not indicated, the optimal number of clusters
    is determined inside each domain, and the overall solution is
    obtained by concatenating optimal clusters obtained in domains. By
    indicating this solution as a suggestion to the optimization step,
    this may greatly speed the convergence to the optimal solution.

  - A new function ‘selectSampleSystematic’ has been added. It allows to
    select a stratified sample with the systematic method, that is a
    selection that begins selecting the first unit by an initial
    randomly chosen starting point, and proceeding in selecting other
    units by adding an interval that is the inverse of the sampling rate
    in the stratum. This selection method can be useful if associated to
    a particular ordering of the selection frame, where the ordering
    variable(s) can be considered as additional stratum variable(s).

  - It is now possible to handle “anticipated variance” by introducing a
    model linking a proxy variable whose values are available for all
    units in the sampling frame, with the target variable whose values
    are not available. In this implementation only linear model can be
    chosen. When calling the ‘buildStrataDF’ function, a dataframe is
    given, containing three parameters (beta, sigma2 and gamma) for each
    couple target / proxy. On the basis of these parameters, means and
    standard deviations in sampling strata are calculated accordingly to
    given formulas.

# SamplingStrata 1.2

## Major changes

  - The crossover function in the genetic algorithm has been modified by
    considering the “grouping” version of this algorithm: instead of
    mixing chromosomes in an indifferentiate way, groups of them in one
    parent (representing already aggregated strata) are attributed to
    the other parent when generating a child, preserving their
    composition. Moreover, parents are selected with a probability
    proportional to their fitness (in the previous version selection was
    completely at random). In many cases this can greatly speed the
    convergence to an optimal solution.

  - A new function ‘adjustSize’ has been added. It allows to adjust the
    sample size and related allocation in strata on the basis of an
    externally indicated overall sample size. The adjustment of the
    sample size is perfomed by increasing or decreasing it
    proportionally in each optimized stratum.

  - A new function ‘buildFrameDF’ has been added. It allows to create a
    ‘sampling frame’ dataframe by indicating the dataset in which the
    information on all the units are contained, the identifier, the X
    variables, the Y variable and the variable that indicates the
    domains of interest.

  - In function ‘optimizeStrata’ now the ‘initialStrata’ parameter is a
    vector, whose length is equal to the number of strata in the
    different domains.

  - All outputs are written to an .subdirectory.

# SamplingStrata 1.1

## Major changes

  - Function ‘memoise’ from the same package (now required) is applied
    before each evaluation in order to save processing time. This may
    largely increase the efficiency of the algorithm.

  - A ‘recode’ function is applied on every generated solution in order
    to recode a genotype of n genes with k\<=n distinct alleles 1, 2, …,
    k in such a way that the distinct alleles of the recoded genotype
    appear in the natural order 1, 2, …, k. This avoids to consider as
    distinct two solutions that are equivalent but make use of a
    different coding.

# SamplingStrata 1.0-4

## Bug fix for old releases

# SamplingStrata 1.0-3

## New functions and changes

  - Modified the output to the console or to the file of results: of all
    the solutions, only the optimal value for each generation is
    visualised.

  - Now the visualisation of the trend in the optimal and mean values is
    optional: the plot can be avoided by setting showPlot = FALSE when
    calling optimizeStrata. It can be advisable when the number of
    iterations is very high.

# SamplingStrata 1.0-2

## Bug fix for old releases.

# SamplingStrata 1.0-1

## Major changes

  - The object returned by function “optimizeStrata” is no more a
    dataframe but a list: (i) the first element of the list is the
    solution vector
    (solution\(indices); (ii) the second element of the list is the dataframe containing aggregated strata (solution\)aggr\_strata).

  - In all the functions that previously produced .csv files and .pdf
    plots in the working directory, as a default this is no more the
    current behaviour. To write these files, it is now necessary to set
    the “writeFiles” flag to TRUE.

# SamplingStrata 1.0

## Major changes

  - Two new functions: (i) ‘evalSolution’, to evaluate the found
    solution in terms of expected target variables precision and bias
    obtainable by samples drawn from the otpimized frame; (ii)
    ‘tuneParameters’, to determine the best combination of values to
    assign to the parameters necessary for the execution of the genetic
    algorithm used for the optimization of the frame stratification.