Can Spam Be Useful? User-Defined Spam as Electronic Discourse

Dr. Ambartsoumian’s presentation at the Research and Networking Breakfast with a conversation topic on “Data Storage”
Apr 24, 2019, FedEx Institute of Technology
E-Mail as the Most Prevalent Database Type and a Perfect Data Collection and Management Tool
High validity of data in collected messages: verifiable sources and destinations, which are also time-stamped and automatically and manually tagged with metadata according to IETF standards
All my data include the appropriate addresses, and more than half of the messages (over 50,000 by now) mention senders’ names
Textual data are linguistic data and can be analyzed according to the existing linguistic theories
Benefits: leads to clean lexicons and curated taxonomies; perfect for Design Science Research and NLP
Issues:
  Conversion to the .csv format KILLED ALL THE TIMESTAMPS!
  NLP as a field is extremely disorganized
  Too many people do silly things with text these days
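The timestamp loss noted above can usually be avoided by parsing the Date header explicitly and writing it out as an ISO 8601 string before any spreadsheet tool touches the file. The following is a minimal sketch using only the Python standard library; the sample message, the column names, and the helper functions message_to_row and rows_to_csv are hypothetical illustrations, not the pipeline actually used for this collection.

```python
import csv
import io
from email import message_from_string
from email.utils import parseaddr, parsedate_to_datetime

# Hypothetical sample message; addresses use reserved example domains.
RAW_MESSAGE = """\
From: Jane Doe <jane.doe@example.com>
To: researcher@example.org
Date: Wed, 24 Apr 2019 08:30:00 -0500
Subject: Storage webinar invitation

Join our webinar on data storage.
"""


def message_to_row(raw):
    """Turn one raw message into a flat dict suitable for a CSV row."""
    msg = message_from_string(raw)
    sender_name, sender_address = parseaddr(msg["From"])
    # Parse the Date header into a timezone-aware datetime.
    sent_at = parsedate_to_datetime(msg["Date"])
    return {
        "sender_name": sender_name,
        "sender_address": sender_address,
        "recipient": parseaddr(msg["To"])[1],
        # ISO 8601 text keeps both the time and the UTC offset.
        "sent_at": sent_at.isoformat(),
        "subject": msg["Subject"],
    }


def rows_to_csv(rows):
    """Serialize a list of row dicts to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


row = message_to_row(RAW_MESSAGE)
csv_text = rows_to_csv([row])
print(csv_text)
```

Storing the timestamp as plain ISO 8601 text is a design choice made here for illustration: the value can be parsed back into a datetime later instead of depending on how a given tool formats date cells.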
Microsoft’s Extensible Storage Engine
Extensible Storage Engine (ESE), also known as JET Blue, is an ISAM (indexed sequential access method) data storage technology from Microsoft.
ESE is the core of Microsoft Exchange Server, Active Directory, and Windows Search.
It is also used by a number of Windows components, including the Windows Update client and Help and Support Center.
Its purpose is to allow applications to store and retrieve data via indexed and sequential access.
An ESE database looks like a single file to Windows.
Internally, the database is a collection of 2, 4, 8, 16, or 32 KB pages arranged in a balanced B-tree structure.
Oracle’s InnoDB Storage Engine
InnoDB is a storage engine for the database management system MySQL.
MySQL 5.5 (December 2010) and later use it by default, replacing MyISAM.
It provides the standard ACID-compliant transaction features, along with foreign key support (Declarative Referential Integrity).
Full-text search indexes, since MySQL 5.6 (February 2013) and MariaDB 10.0
Spatial operations, following the OpenGIS standard
Virtual columns, only in MariaDB
Show Interest and They Will Interview Themselves For You
Professional advertisement as an agenda-setting (Baran, 2014) communication: a self-forming stream of operants aimed at professional discourse communities (Porter, 1991)
Theoretically defined by three main influencers:
  Skinner’s Operant Conditioning (1957): professional shaming as a punishment and a chance to condescend as a reward
  Foucault’s (1991) episteme (every type of knowledge has its own discourse), plus power relationships: advertisers are too big to ignore
  Glaser (1978), Glaser and Strauss’ (1967) “voices begging to be heard” (or demanding to be collected, in my case)
Definitely structured by multiple electronic communication protocols for decades (e.g. RFC 822, “Standard for the Format of ARPA Internet Text Messages”, published in 1982 as the successor to RFC 733 of 1977) but mistakenly labeled “unstructured” by the SQL programmers
Minimizes researcher’s bias
Result: a Reliable Data Supply Chain with a Theoretically Defined Research Boundary and Protocol-Defined Data Structure and Storage
Addressable outbound electronic discourse seeks recipients, minimizes researcher’s bias in data collection, and is designed for standardized data storage
Minimizes Garbage-In:
  Stored in packets while in transit in the channel (Shannon, 1948)
  Assembled at destination with TCP-assured reliability (acknowledgment and retransmission)
  With e-mail fetching protocols like POP3 and IMAP, messages are identified and referenced by a unique ID (UID)
E-mail is not structured for building relational tables, but is structured and densely tagged for linguistic analysis (e.g. application of lexical codes)
Structured metadata provides sender-defined categories (makes category induction less biased) and allows for automatic market analysis by showing the sender name and domain
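The last point, reading the sender name and domain straight from the structured From header, can be sketched in a few lines of standard-library Python. The three sample messages, the reserved .example domains, and the helper sender_profile are hypothetical and stand in for a real message store; the sketch only illustrates the idea of domain-level tallies, not the analysis actually performed.

```python
from collections import Counter
from email import message_from_string
from email.utils import parseaddr

# Hypothetical sample messages standing in for a collected mailbox.
RAW_MESSAGES = [
    "From: Acme Storage <news@acme.example>\nSubject: New arrays\n\nBody",
    "From: Acme Events <events@acme.example>\nSubject: Webinar\n\nBody",
    "From: info@cloudco.example\nSubject: Backup offer\n\nBody",
]


def sender_profile(raw):
    """Return (display name, lower-cased domain) taken from the From header."""
    msg = message_from_string(raw)
    name, address = parseaddr(msg.get("From", ""))
    # Everything after the last "@" is treated as the sender domain.
    domain = address.rpartition("@")[2].lower()
    return name, domain


profiles = [sender_profile(raw) for raw in RAW_MESSAGES]

# Messages per sending domain: a rough proxy for who is advertising.
domain_counts = Counter(domain for _, domain in profiles)

# Share of messages whose From header carries a display name.
named_share = sum(1 for name, _ in profiles if name) / len(profiles)

print(domain_counts.most_common())
print(f"{named_share:.0%} of messages mention a sender name")
```

Because the categories come from the senders' own headers rather than from the researcher's coding, a tally like this is one way the metadata can reduce bias in category induction.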
