Monday, April 1, 2019
Counterculture Analysis: Blackbeard
Zachariah Chiles

Many groups have been established as countercultures throughout the course of history. However, what makes those groups truly countercultures? Author W. LaVerne Thomas attempts to answer such a question in his book, defining a counterculture as "a group that rejects the major values, norms, and practices of the larger society and replaces them with a new set of cultural patterns" (Thomas). One group that clearly fits Thomas's definition is Blackbeard's pirates. This group rejected the cultural patterns of the British monarchy to live their own cutthroat life of stealing, killing, and raping. To this day, pirates are still a significant threat to those who travel international waters, and even to those who live in third-world countries.

Before Blackbeard earned his name, he was known as Edward Teach or Edward Thatch. As far as his origin goes, not much is known about Thatch. However, it is recorded that he joined the British navy as a privateer during Queen Anne's War and turned to piracy shortly after (Division of Archives and History's Office of State Archaeology). Thatch began his pirating in 1713 under Captain Benjamin Hornigold (Ossian). Once given a smaller ship by Hornigold and able to lead his own crew as a captain, Blackbeard found the French slave ship La Concorde. This ship would become known to most as the Queen Anne's Revenge; La Concorde was big, fast, and powerful. With such a vessel, Blackbeard knew his men could cause much more havoc (Woodard). In 1717, the two pirates were so deadly that the British monarchy offered both Hornigold and Blackbeard money in exchange for giving up pirating. Hornigold accepted, whereas Blackbeard refused the offer and continued ravaging the Caribbean on his esteemed Queen Anne's Revenge. However, his time came to an end on November 22nd, 1718, when he faced a British Royal Navy contingent sent by Governor Alexander Spotswood. Blackbeard and his crew mainly raided ships for one thing, and that was gold. Everything they did was based upon how much plunder they could take, and although he died many years ago, his reputation and name still stand out in the history of pirating.

Both the sociological perspective and the sociological imagination can be used to explain the actions of Blackbeard and his crew. According to author LaVerne Thomas, "The sociological perspective helps you see that all people are social beings. It tells you that your behavior is influenced by social factors and that you have learned your behavior from others" (Thomas). Many heard and saw the stories of Blackbeard and his ferocious crew. Because of this, many saw his actions and adopted them, continuing to pirate and adapting Blackbeard's techniques for more efficient plundering. His name alone put fear in the hearts of men, and many saw that fear, wanted to inspire it themselves, and took up piracy and life on the seas. C. Wright Mills believed the sociological imagination is "the capacity to range from the most impersonal and remote topics to the most intimate features of the human self and to see the relationship between the two" (Thomas).
In other words, this describes "the insight of how your social environment shapes you, and how you shape your social environment" (Thomas). Blackbeard and his crew's environment most likely included a poor social background and the loss of a loved one. Many who are greedy and kill have often grown up in such conditions. They surrounded themselves with murderers and thieves, and thus became murderers and thieves themselves. They shaped their social environment by surrounding others with the same negative behavior, thus drawing new people to join Blackbeard's crew. The more people in his crew, the more people there were to go out and tell the infamous story of Blackbeard, the cutthroat killer.

Ethnocentrism is a large part of any culture. It is described as "the tendency to view one's own culture and group as superior" (Thomas). Countercultures are subcultures; therefore Blackbeard and his crew were technically a subculture of the larger society, the British monarchy. Blackbeard and his crew saw their own norms as superior to the restricting life of the monarchy, and therefore ethnocentrism formed. Also, the British, already ethnocentric themselves, saw the opposing moral patterns set by Blackbeard's newfound subculture and rejected their views, making Blackbeard and his crew a counterculture. Many examples can be given as to why he and his crew were a counterculture. One such case is that there was no law against killing on Blackbeard's ship, whereas killing was outlawed in the British monarchy. Another similar case is stealing: Blackbeard plundered and stole from other ships for loot, whereas such atrocities were against the law in the British monarchy.

Cultural relativism can be defined as "the belief that cultures should be judged by their own standards rather than by applying the standards of another culture" (Thomas). Looting, pillaging, and killing are what pirates know. These simple standards cannot be judged against outside cultural beliefs without noticing the large moral negativity that follows. Blackbeard and his crew had no moral compass, so their actions should not be judged through the eyes of the British monarchy. From a logistical point of view, they, being strong, picked on the weak in order to gain wealth and become stronger in the world. Although they may have known that what they did was morally unacceptable and went against the laws of many larger societies, they followed their own standards and traditions and should not be judged outside of them.

My counterculture, Blackbeard and his crew, had many challenging norms and standards that opposed those of many societies of that era as well as of modern times. However, this does not excuse the actions of Blackbeard and his crew. Killing, stealing, and plundering all leave large marks on this world. From crushing the economy of a British town to killing the last son of a lonely French mother, cultures that directly affect larger societies in such negative ways should not exist. Cultures having opposing standards is completely fine, as long as those opposing standards do not actively contradict the standards of a larger society. Blackbeard and his crew had very loose standards, but the deaths they caused force me to disagree with the philosophy and norms of their counterculture.

References
Division of Archives and History's Office of State Archaeology. "Queen Anne's Revenge Project." n.d. Web. 12 Mar. 2017.
Ossian, Rob. "The Pirate King." n.d. Web. 12 Mar. 2017.
Thomas, W. LaVerne.
Holt Sociology: The Study of Human Relationships. Holt, Rinehart and Winston, 2003.
Woodard, Colin. The Republic of Pirates. New York: Houghton Mifflin Harcourt Publishing Company, 2007.

Customer Segments in Retail Supermarket Analysis

CHAPTER 1: INTRODUCTION

BACKGROUND
In today's dynamic retail environment, customers are offered a tremendous range of product choices, and their loyalty is increasingly becoming transitory due to the severe impact of competitors' actions on existing relationships (Reinartz and Kumar, 2000). This increased competition to satisfy the diverse needs of the customer forces retailers to shift their conventional production and selling focus towards customer relationships.

In the context of the retail supermarket, this has resulted in large investments in retail information systems that collect shopper information in order to understand customer shopping behaviour (Brijs, T. et al, 2001). Several tools and technologies of data warehousing, data mining, and other customer relationship management (CRM) techniques are utilised to manage and analyse this data. Data mining in particular, which simply means extracting knowledge from large amounts of data, helps organisations to discover the patterns and trends in their customers' data and then to drive improved customer relationships (Rygielski, Wang and Yen, 2002).

According to Witten and Frank (2005), data mining techniques including decision trees (DT), artificial neural networks (ANN), genetic algorithms (GA), association rules (AR), etc., are commonly used to solve customer-related problems in diverse fields like engineering, science, finance and business. In the retail supermarket domain, data mining can be applied to identify useful customer behaviour patterns from large amounts of customer and transaction data (Giudici and Passerone, 2002). Consequently, the discovered information can be used to support better decision-making in retail marketing. Data mining techniques have mostly been adopted to make predictions and to describe behaviours.

During the past decade, there have been a number of significant developments in data mining techniques. Most of these developments are used in customised service (Chen et al, 2005), which is vital in retail markets for developing customer relationships. Therefore, this research focuses on providing customised service to distinct customer segments in retail supermarkets by implementing data mining techniques with the help of data mining tools.

Related Work
Researchers have proposed different approaches to mining the sales transaction data of a retail supermarket to improve customer relationships. Previously, customer behavioural variables such as Recency-Frequency-Monetary (RFM) variables were associated with demographic variables to predict customer buying behaviour (Chen et al, 2005). Current research has improved significantly, as Business Intelligence tools and advanced data mining algorithms are implemented to analyse the data in a much more refined way. Liao et al (2008) proposed a methodology based on the Apriori and k-means algorithms to mine customer knowledge from household customers for product and brand extension in retailing.
Bottcher et al (2009) presented an approach which aimed to mine changing customer segments in a dynamic market by deriving frequent itemsets as representations of customer segments at different points in time, which are then analysed for changes.

Problem Definition
Effective management of sales transaction data is as important as any other asset for a retail supermarket store. The sales transaction data usually holds a great amount of information distributed through numerous transactions. This study focuses on applying data mining techniques to analyse the sales transaction data of a retail supermarket store and suggests recommendations to provide customised service to defined customer segments. This research specifically uses two data mining techniques, namely clustering and association rule discovery. The research starts by identifying different customer segments based on their purchase frequencies, in order to find out the differences in their purchase behaviour. The definition of behaviour in the retail supermarket domain covers different meanings; for example, retailers often distinguish between light, medium and heavy users, or weekday and weekend customers (Brijs et al, 2001). In this research, the differences will be discovered by identifying frequently purchased items for each customer segment and comparing their combinations. The retailer may use this information to customise his offer towards those segments and also to further examine the underlying relationships between those items for purposes of pricing, product placement or promotions.

AIM AND OBJECTIVES
The aim of this research is to provide customised service to defined customer segments in a retail supermarket, by implementing data mining techniques on sales transaction data with the help of data mining tools.

OBJECTIVES
To conduct a critical review of the literature and present the current research within the discipline.
To obtain the customer sales transaction dataset, in order to apply the data mining algorithms.
Based on the literature review, to select the appropriate data mining approach to pre-process the dataset and to implement the algorithms on the pre-processed data.
To analyse the results obtained from the data mining algorithms and propose recommendations to provide customised service.
To draw conclusions, discuss the limitations of this research and suggest areas of future research.

Research Approach
This research follows a quantitative methodology by obtaining the dataset and analysing the data with data mining tools. The dataset for analysis was obtained from the ABC retail supermarket store, Canada, which was available online (http://www.statsci.org/datasets.html). The data required for this project is selected and loaded onto the data mining tools SPSS (Statistical Package for the Social Sciences) and Weka, the tools selected for this research to mine the data. The data mining algorithms selected for this study are the k-means algorithm for clustering and the Apriori algorithm for association rule mining; the reasoning behind the choice of these algorithms is justified in the literature review. These algorithms are implemented on the dataset with SPSS and Weka. The results obtained from these algorithms need to be presented with the help of charts, tables and graphs, and Microsoft Excel is used to plot them.
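As a concrete illustration of this selection-and-loading step, the short Python sketch below pulls the required attributes out of a raw transaction export before the data is handed to SPSS or Weka. It is only a sketch: the file name and column names are hypothetical, since the exact layout of the ABC dataset is not reproduced here.

```python
# Minimal sketch of the data selection step (hypothetical file and column
# names; the real layout of the ABC store dataset may differ).
import pandas as pd

# Load the raw sales transaction export.
raw = pd.read_csv("abc_supermarket_transactions.csv")

# Keep only the attributes needed for this project: who bought what, and when.
needed = ["customer_id", "transaction_id", "item", "date"]
transactions = raw[needed].dropna()

# Save the selected subset so it can be imported into SPSS or Weka.
transactions.to_csv("transactions_selected.csv", index=False)
print(transactions.head())
```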
Finally, the recommendations are made based on the analysis of the results.

Dissertation Outline
This chapter presents the essence of this dissertation, highlighting the aim and objectives of this research. The rest of this dissertation is structured as follows:
Chapter 2 provides a comprehensive literature review of different aspects relating to the research topic under study.
Chapter 3 discusses in detail the research methods and the data analysis techniques followed, in order to achieve the aim of this research.
Chapter 4 presents the analysis of the results obtained from the application of data mining algorithms on the data and provides recommendations.
Chapter 5 summarises the whole project, gives insights on the limitations of this research and points out areas of future research.

CHAPTER 2: LITERATURE REVIEW

Introduction
This chapter provides a critical review of literature addressing the application of data mining in retail supermarkets. It begins with an introduction to data mining, followed by its evolution and applications in today's business world. It then explores the role of data mining in retail supermarkets in improving customer relationships, followed by a discussion about the typical data mining approach. It also discusses the techniques and algorithms applied in this project and the reasons for their choice.

Data Mining: An Introduction
The word mining means extracting something useful or valuable, such as mining gold from the earth (Lappas, 2007). The importance of mining is growing continuously, especially in the business world. Data mining is a process of finding interesting patterns in databases for decision-making. It is one of the fastest growing and most prominent fields, and it can provide a significant advantage to an organisation by exploiting its vast databases (Rygielski, Wang and Yen, 2002). Finding patterns in business data is not new; traditionally, business analysts use a statistical approach. The computer revolution and huge databases ranging from a few gigabytes to terabytes changed this scenario. For example, companies like Wal-Mart store huge amounts of sales transaction data, which can be used to analyse customer buying patterns and make predictions (Bose and Mahapatra, 2001). Data warehousing technology has enabled companies to store huge amounts of data from multiple sources under a unified schema.

Data mining has been considered a tool of business intelligence for knowledge discovery (Wang and Wang, 2008). Many people consider data mining to be Knowledge Discovery from Data (KDD), but it is actually a part of the larger process called knowledge discovery, which describes the steps that must be taken to secure the desired results (Han and Jiawei, 2006). A typical data mining process involves various iterative steps: the first step is the selection of appropriate data from a single database or multiple source systems, followed by cleaning and preprocessing for consistency. The data is then analysed to find patterns and correlations. This approach complements other data analysis techniques such as statistics and OLAP (On-line Analytical Processing) (Bose and Mahapatra, 2001). Every organisation follows a different data mining and modelling process to achieve its business imperatives.

The Evolution of Data Mining
It all started with the need to store data in computers and improve access to it for decision-making.
Today, the technology enables users to access and navigate real-time data. At the beginning of the 1960s, data was collected for the purpose of making simple calculations to answer business questions such as the total average revenue for a specific period of time. In the 1980s and 1990s the usage of data warehouses to store data in a structured format emerged, and policies regarding the format of data to be used in an organisation were implemented (Therling, K., 1998). Data warehouses extended to become multi-dimensional, which allows the stakeholder to drill down and navigate through the data. Nowadays, online analytic tools assist in retrieving the data in real time, and computers can query data from the past up to the present. In recent years many technologies such as statistics, AI (Artificial Intelligence) and machine learning have been evolving as core sectors in the data mining field (Rygielski, Wang and Yen, 2002). These technologies, combined with relational database systems and data integration, provide potential knowledge from the data.

Data Mining Applications
Data mining can be applied in many fields depending on the aim of the company. Some of the main areas in today's business world where data mining is applied are as follows (Apte, C. et al, 2002):
Finance
Telecom
Marketing
Web analysis
Insurance
Retail
Healthcare

Data Mining for CRM in Retail Supermarkets
Swift (2001) defined CRM as an "enterprise approach to understanding and influencing customer behaviour through meaningful communications in order to improve customer acquisition, customer retention, customer loyalty, and customer profitability." According to research by the American Management Association, it costs three to five times as much to acquire a new customer than to retain an existing one, and this is especially evident in the services sector (Ennew and Binks, 1996). Therefore it is very important to create a good relationship with existing and new customers rather than simply expanding the customer base.

A large number of companies are adopting various tools and strategies to enable more effective CRM, in order to gain an in-depth understanding of their customers. Data mining is a powerful new technique which helps companies to mine the patterns, trends and correlations in their large amounts of customer and product data, to drive improved customer relationships. It is one of the well-known tools applied to customer relationship management (CRM) (Giudici and Passerone, 2002). In the context of the retail supermarket, these patterns not only assist retailers in offering high quality products and service to their customers, but also help them to understand changes in customer needs.

Data Mining Applications for CRM in Retail Supermarkets
Using data mining to improve customer relationships in the retail supermarket is a wide area of research interest. Depending on the retailer's objective, there are various application areas in which data mining can be applied to enhance customer relationship management.
Some of the major data mining applications in retail supermarkets identified from the literature are as follows:
Cross-selling (Brijs et al 1999; Feng and Tsang, 1999)
Product recommendation (Shih and Liu 2005; Li et al 2009)
Customer behaviour modelling (Baydar, C. 2003; Cadez, 2001)
Shelf space allocation (Chen and Lin 2007; Chen et al 2006)
Catalogue design (Ester et al 2004; Lin and Hong, 2006)
Direct marketing (Bhattacharyya, 1999; Prinzie and Poel, 2005)
Price optimisation (Chen et al 2008; Kitts and Hetherington, 2005)

THE DATA MINING PROCESS
Ivancsy and Vajk (2006) defined the three main stages involved in the data mining process: (i) preprocessing, (ii) pattern discovery, and (iii) pattern analysis/interpretation.

Preprocessing
Famili, A. (1997) defined data preprocessing as all the actions taken before the actual data analysis process starts. It is essentially a transformation T that transforms the raw real-world data vectors Xik into a set of new data vectors Yij:
Yij = T(Xik)
such that
(1) Yij preserves the valuable information in Xik,
(2) Yij eliminates at least one of the problems in Xik, and
(3) Yij is more useful than Xik.
In the above relation, i = 1...n where n is the number of objects, j = 1...m where m is the number of features after preprocessing, and k = 1...l where l is the number of attributes/features before preprocessing; in general, m ≤ l.

The most common data used for mining purchase behaviour in the retail supermarket is customer and transaction data (Giudici and Passerone, 2002). With a huge collection of customers' sales transaction data available in the databases, it is essential to pre-process the data and extract the useful information from it. In the context of retail supermarkets, Pinto et al (2006) suggested four key tasks in data preprocessing: data selection, data cleaning, data transformation, and data understanding.

The first preprocessing task is data selection. Here the subset of the data on which pattern discovery is to be performed is identified. This task is especially helpful in solving the problem of large amounts of data by precisely evaluating and categorising the data into much smaller datasets. The computational requirements necessary for data analysis and manipulation are also hugely reduced by preprocessing large datasets through data selection techniques such as clustering or vector quantization (Famili, A., 1997).

The second is data cleaning, where basic operations include removing noise and handling missing data (Fayyad et al, 1996). Other issues regarding data quality, such as errors and insufficient attributes, which may complicate data analysis, are also addressed in data cleaning. In most cases missing attribute values are replaced by the attribute mean, but traditionally, if more than 20% of attribute values are missing, the entire record is eliminated (Famili, A., 1997). To handle outliers and noisy data, techniques such as binning (partitioning the sorted attribute values into bins), clustering and regression are applied.
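To illustrate the cleaning rules just described (dropping records with more than 20% of attribute values missing, replacing remaining gaps by the attribute mean, and binning a numeric attribute to smooth outliers), a minimal pandas sketch might look as follows; the file and column names are hypothetical.

```python
import pandas as pd

# Hypothetical file produced by the earlier selection step.
df = pd.read_csv("transactions_selected.csv")

# Drop any record in which more than 20% of the attribute values are missing.
df = df[df.isna().mean(axis=1) <= 0.20]

# Replace the remaining missing numeric values by the attribute (column) mean.
numeric_cols = df.select_dtypes(include="number").columns
df[numeric_cols] = df[numeric_cols].fillna(df[numeric_cols].mean())

# Smooth a numeric attribute by equal-width binning, a simple outlier treatment
# ("basket_value" is a hypothetical attribute used only for illustration).
if "basket_value" in df.columns:
    df["basket_value_bin"] = pd.cut(df["basket_value"], bins=10)
```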
The next preprocessing task is data transformation. The application of each data mining algorithm requires the presence of data in a mathematically feasible format (Crone et al, 2006); inaccuracies in the measurement of input, or incorrect feeding of data to the data mining algorithm, could cause various problems. Hence, operations such as normalization, aggregation, generalization and attribute construction are performed. Normalization deals with scaling the attribute values into a specific range, whereas aggregation and generalization refer to the summarisation of data in terms of numeric and nominal attributes. Attribute construction handles the replacement or addition of new attributes based on the existing attributes (Markov, Z. and Larose, T.D., 2007).

Once issues regarding the data are solved and the data is prepared, understanding the nature of the data is useful in many ways. According to Famili, A. (1997), the majority of data analysis tools have some limitations regarding the data characteristics; therefore, it is important to recognise these characteristics for an appropriate setup of the data analysis process. He further pointed out that techniques such as visualization and principal component analysis are useful for better understanding the data.

Pattern Discovery
Fayyad et al (1996) stated that the core of the process is the application of specific data mining methods for pattern discovery and extraction. Pattern discovery is the key stage of the process in this research, as this is where the data is mined. Once the data is pre-processed and the irrelevant information is eradicated, the data is mined using data mining techniques to discover patterns. However, it is not the aim of this paper to describe all the available algorithms and techniques derived from these fields. This research focuses on two main data mining methods that help to mine the data and find patterns: clustering and association. The reasoning behind choosing these methods is justified below.

Clustering
Clustering can be defined as a technique to group together a set of items having similar characteristics (Kuo et al, 2002). In the retail domain, cluster analysis is a common tool to segment customers on the basis of their similarity on a chosen segmentation base or set of bases (Stewart, D.W. and Girish, P., 1983). The actual choice of one or a combination of these bases largely depends on the business question under study (Wind, Y., 1978). Segmentation can be done on the basis of various variables/bases, such as 1) general or product-specific, and 2) observable or non-observable, as classified by Wedel, M. and Kamakura (2000). General bases for segmentation are independent of products, services or circumstances, whereas product-specific bases for segmentation are related to the product, the customer or the circumstances. Observable segmentation bases can be measured directly, whereas non-observable bases must be inferred.

Twedt, D.W. (1967), as cited in Engel, J.F. et al (1972), stated that the existence of huge amounts of transaction data in the retail supermarket domain provides a great impetus for segmentation on the basis of purchase frequencies. Segmentation on this basis divides customers into groups by their intensity of buying a product or products, such as light, medium and heavy buyers. According to Brijs, T. (2002), if customers are classified by their purchase frequency, these segments can then be treated differently in terms of marketing communication (pricing, promotion, product recommendation, etc.) to achieve a greater return on investment (ROI) and customer satisfaction.
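To make the purchase-frequency base concrete, the sketch below derives a per-customer frequency profile from a transaction table (again with hypothetical file and column names):

```python
import pandas as pd

# Hypothetical cleaned transaction table (one row per purchased item).
transactions = pd.read_csv("transactions_selected.csv")

# Build one row per customer: how often they shopped and how much they bought.
profile = (
    transactions.groupby("customer_id")
    .agg(visits=("transaction_id", "nunique"),
         items_bought=("item", "count"))
    .reset_index()
)

# 'visits' and 'items_bought' act as the purchase-frequency segmentation base
# on which customers can later be grouped (e.g. light, medium, heavy buyers).
profile.to_csv("customer_profile.csv", index=False)
print(profile.head())
```

A table like this, with one row per customer, is the segmentation base that the clustering step can then operate on.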
Therefore, in this research clustering is employed to segment the customers into various clusters on the basis of their similarity in purchase frequency. Several algorithms have been proposed in the literature for clustering, such as ISODATA, CLARA, CLARANS, ScaleKM, P-CLUSTER, DBSCAN, Ejcluster, BIRCH and GRIDCLUS (Kanungo, T. et al, 2002). It is not the objective of this research to use all these algorithms for clustering. However, as discussed earlier, the k-means clustering algorithm will be used, and its justification is given below.

k-Means Clustering Algorithm
K-means has been considered one of the most effective algorithms for producing good clustering results in many practical applications (Alsabti et al, 1998). The main reason is that, when clustering is done for the purpose of data reduction, the goal is not to find the best partitioning, but simply a reasonable consolidation of N data points into k clusters and, if necessary, some efficient way to improve the quality of the initial partitioning (Faber, 1994). Therefore, the k-means algorithm proves to be very effective in data reduction and produces a good clustering output.

The k-means algorithm groups data points that are similar into various clusters, namely Cluster 0, Cluster 1, up to Cluster n (Kanungo et al, 2002). Given a set of n data points in real d-dimensional space (Rd) and an integer k, the aim is to determine k points in Rd, called the centers, so as to minimize the mean squared distance from each data point to its nearest center. This measure is often called the squared-error distortion (Jain and Dubes, 1988).

A standard illustration of the k-means algorithm shows the results during two iterations in the partitioning of nine data points into two well-separated clusters: points in cluster 1 are shown in red and points in cluster 2 in black, data points are denoted by open circles and reference points by filled circles, and clusters are indicated by dashed lines. The iteration converges quickly to the correct clustering even though there was a bad initial choice of reference points.

Lloyd's algorithm is another popular version of k-means clustering, which requires about the same amount of computation for a single pass through all the data points, or a single iteration, as the standard k-means algorithm (Faber, 1994). Lloyd's algorithm is similar to the standard k-means algorithm, except that the cluster centroids chosen as reference points in subsequent partitions are adjusted both during and after each partition. However, the k-means algorithm constantly updates the clusters and requires relatively fewer iterations than Lloyd's algorithm; thus, the k-means algorithm is considerably faster. This is the key reason for the selection of the k-means algorithm, since it can group the customers which have similar purchase frequencies into different clusters in fewer iterations. However, Faber (1994) pointed out two major drawbacks of this algorithm. Firstly, it is computationally inefficient for large datasets. Secondly, although the algorithm will always produce the desired number of clusters, the centroids of these clusters may not be particularly representative of the data.
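Before moving on to association rules, here is a minimal sketch of the clustering step itself, using scikit-learn's KMeans on the purchase-frequency profile built earlier. The column names are the hypothetical ones used above, and k = 3 is purely an illustrative choice (e.g. light, medium and heavy buyers), not a value taken from this study.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical purchase-frequency profile produced in the earlier sketch.
profile = pd.read_csv("customer_profile.csv")

# Scale the frequency attributes so that neither dominates the distance measure.
X = StandardScaler().fit_transform(profile[["visits", "items_bought"]])

# k-means chooses k cluster centres so as to minimise the squared distance
# from each customer to its nearest centre; k = 3 is an illustrative choice.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
profile["cluster"] = kmeans.fit_predict(X)

# A quick check that the resulting segments really differ in purchase intensity.
print(profile.groupby("cluster")[["visits", "items_bought"]].mean())
```

Inspecting the cluster means is a quick sanity check that the resulting segments (Cluster 0, Cluster 1, and so on) really differ in purchase intensity.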
Association Rules
Association rule discovery was proposed to find all rules in basket data in order to analyse how the items purchased by customers in a shop are related (Gery and Haddad, 2003). A rule refers to the discovery of attribute-value associations that occur frequently together within a given data set (Han and Kamber, 2001). It is typically used for market basket analysis to discover rules of the form "x% of customers who buy items A and B also buy item C" (Zaiane, 2001), which is an implication of the form (A, B) → C.

Some of the key definitions drawn from the literature concerning the association rule technique are provided below (Agarwal, Imielinski and Swami, 1993).
Itemset (i): a set of items contained in a single transaction (e.g. milk, sugar, curd).
Support (s): the support expresses the fraction of transactions in the data that contain both the items in the antecedent and the consequent of the rule.
Confidence (c): confidence estimates the conditional probability of B given A, i.e. P(B | A), and can be calculated as Confidence (c) = s(A ∪ B) / s(A).

Association rule discovery typically involves a two-phased sequential methodology (Brijs, T., 2002).
Finding frequent itemsets: the first phase involves searching for so-called frequent itemsets, i.e. itemsets for which the support in the database equals or exceeds the minimum support threshold set by the user. This is computationally the most complex phase because of the number of possible combinations of items that need to be tested for their support.
Generating association rules: once all frequent itemsets are known, the discovery of association rules is comparatively straightforward. The general scheme is that, if ABCD and AB are frequent itemsets, then it can be determined whether the rule AB → CD holds with sufficient confidence by computing the ratio confidence = s(ABCD) / s(AB). If the confidence of the rule equals or exceeds the minconf threshold set by the user, then it is a valid rule. For an itemset of size k, there are potentially 2^k − 2 confident rules.

Association rules can help to discover frequently purchased combinations of products within a customer segment and to provide customised service by promoting certain products or product combinations to the defined segments (Brijs, T. et al, 2001). Therefore, in this research, frequent itemsets for each customer cluster will be generated and their combinations compared to identify the differences in purchase behaviour, in order to provide customised service.

Traditionally, support and confidence are used in association rule discovery, but Aggarwal and Yu (1998) criticised this support-confidence framework for the following main reasons. First of all, setting good values for the support and confidence parameters in association rule mining is critical. Setting the support threshold too low leads to the generation of a large number of frequent itemsets; even if these were statistically significant, their support is usually too low to have a significant influence. On the other hand, setting the support threshold too high increases the probability of missing some important associations between items. Further, Agarwal and Yu (1998) and Brin et al (1998), as cited in Brijs, T. (2003), introduced the lift (also called interest) measure to overcome the disadvantage of confidence in not taking the baseline frequency of the consequent into account.
Lift/Interest (l): lift is computed as the confidence of the rule divided by the support of the right-hand side (RHS).
In other words, lift is the ratio of the probability that A and B occur together to the product of the two individual probabilities for A and B:
Lift (l) = s(A ∪ B) / (s(A) · s(B))

In order to perform predictive analysis, it is useful to discover interesting patterns in the given dataset that serve as the base for future trends. The best known and most popular algorithm used for this analysis is the Apriori algorithm (Varde et al, 2004).

Apriori Algorithm
The Apriori algorithm was proposed by Agarwal et al (1994) (Varde et al, 2004). The algorithm finds frequent items in a given data set using the anti-monotone constraint (Petrucelli et al, 1999, as cited in Varde et al, 2004). It works on the principle that all subsets of a frequent itemset must also be frequent. In other words, if at least one subset of an itemset is not frequent, the itemset can never be frequent. This principle simplifies the discovery of frequent itemsets significantly because, for some itemsets, it can be determined that they can never be frequent without checking their support against the data. This is the key reason for selecting this algorithm, since the association rules for the items can be discovered more quickly and efficiently.

Given a data set, the problem of association rule mining is to generate all rules that have support and confidence greater than a user-specified minimum support and minimum confidence respectively. Candidate sets having k items can be generated by joining large sets having k−1 items, and deleting those that contain a subset that is not large (where "large" refers to support above the minimum support). Frequent sets of items with minimum support form the basis for deriving association rules with minimum confidence: for A → B to hold with confidence C, C% of the transactions containing A must also contain B.

Though the algorithm is very efficient in association rule mining, it has certain drawbacks, as found by Margahny and Shakour (2006). After discovering the 4-frequent itemsets, the algorithm needs extra data structures and methods to process them, since the further itemsets can be obtained in different ways. Also, this method is fast only while handling small data.

There are several tools available for clustering and association rule mining, such as AR Miner, Clementine (SPSS), Enterprise Miner (SAS), Intelligent Miner (IBM), and Decision Series (NeoVista). To mine association rules, Weka is used, which is a collection of machine learning algorithms for data mining tasks, while SPSS Statistics 17.0 is used for clustering. Weka is open source software available online and very efficient in mining large datasets, whereas SPSS Statistics 17.0 is a statistical analysis package available in the Brunel University computer labs.

Pattern Analysis
Pattern analysis means understanding the results obtained by the algorithms and drawing conclusions. This is the last phase in the data mining process, where the uninteresting rules or patterns from the set found in the pattern discovery phase are filtered out (Cooley et al, 2000). The uninteresting patterns are filtered out by applying appropriate methodologies on the results, producing interesting statistical patterns.
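To tie the pattern discovery and pattern analysis stages together, the following self-contained sketch mines frequent itemsets Apriori-style from a handful of toy baskets (illustrative data only, not the ABC dataset) and then keeps only the rules whose confidence and lift clear chosen thresholds, in the spirit of filtering out uninteresting patterns.

```python
from itertools import combinations

# Toy basket data (illustrative only): one set of items per transaction.
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread", "butter", "jam"},
    {"milk", "jam"},
    {"bread", "milk"},
]
n = len(baskets)
min_support, min_confidence, min_lift = 0.4, 0.6, 1.0

def support(itemset):
    """Fraction of transactions that contain every item of the itemset."""
    return sum(itemset <= basket for basket in baskets) / n

# Phase 1 (Apriori idea): grow candidate itemsets level by level, keeping only
# those that reach the minimum support; since every subset of a frequent
# itemset must itself be frequent, supersets of infrequent sets are never kept.
items = sorted(set().union(*baskets))
level = {frozenset([i]) for i in items if support(frozenset([i])) >= min_support}
frequent = set(level)
size = 1
while level:
    size += 1
    candidates = {a | b for a in level for b in level if len(a | b) == size}
    level = {c for c in candidates if support(c) >= min_support}
    frequent |= level

# Phase 2: derive rules A -> B from each frequent itemset and keep only those
# whose confidence and lift pass the thresholds (the pattern analysis step).
for itemset in (s for s in frequent if len(s) > 1):
    for r in range(1, len(itemset)):
        for antecedent in map(frozenset, combinations(itemset, r)):
            consequent = itemset - antecedent
            conf = support(itemset) / support(antecedent)
            lift = conf / support(consequent)
            if conf >= min_confidence and lift >= min_lift:
                print(f"{set(antecedent)} -> {set(consequent)}  "
                      f"support={support(itemset):.2f}, "
                      f"confidence={conf:.2f}, lift={lift:.2f}")
```

With these toy baskets the sketch prints, for example, the rule {butter} -> {bread}, while rules such as {milk} -> {bread}, whose lift falls below 1, are discarded as uninteresting.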
SUMMARY
This chapter discussed the concept of data mining, its evolution and its applications in today's business world. It then provided an overview of the role of data mining in retail supermarkets in improving customer relationships, followed by a discussion about the typical data mining approach. It also discussed the techniques and algorithms applied in this project and the reasons for their choice. The following chapter will explain the research approach followed in this dissertation.

CHAPTER 3: RESEARCH APPROACH

Introduction
This chapter will discuss the research approach employed in this project. It starts with a discussion of the research and literature review methods, followed by the data collection and the justification of the data mining approach applied to the data.

Research Methods
The research approach depends upon the objectives and aim of the study, as it assists the researcher in eliciting appropriate responses. Boyatzis (1998) defines research methods as a systematic procedure used for problem solving where, first, data is collected based on the research question, hypotheses are stated, data analysis is carried out using appropriate techniques, results are interpreted and conclusions are derived. According to Hussey et al (1997), research methods can be divided into two types: the qualitative and the quantitative approach. Oates (2006) states that the quantitative research method deals with data or evidence based on numbers, whereas the qualitative research method includes all non-numeric data. In this research, a quantitative research methodology is used. A quantitative study makes use of the numeric data that has been collected from a group of people interested in the subject area, which is then analysed and interpreted.