INTRODUCTION TO ELECTRONIC COMMERCE ON INTERNET

Business Growth on the Internet (#35)

Monthly reports on network statistics
   FTP: nic.merit.edu
   DIRECTORY: nsfnet/statistics
Up-to-date info on hosts
   FTP: nic.merit.edu
   SUBDIRECTORY: /nsfnet/statistics
   FILE: history
There is no required registration of all users!

Distribution of network users in 1994 (US):
   Commercial=51%
   Research=29%
   Government=9%
   Defense=7%
   Education=4%

Top ten power users in 1994 (US):
   LSI Logic (6652 hosts)
   Bell Research (6158)
   Xerox (4765)
   Cadence (3573)
   Sterling Software (3554)
   Dell Computer (3530)
   Pyramid Technology (3148)
   Portal Communications (2946)
   Performance Systems International - PSI (2933)
   Honeywell (2603)

************************************************************************

Why Do Business on the Internet (#36)

Benefits:
   (a) Global communications
   (b) Corporate logistics
   (c) Competitive advantage
   (d) Information sources
   (e) Customer support and feedback
   (f) Marketing
   (g) Collaboration and development

Issues of importance:
   (a) Internet ethics
   (b) Globalization attitude

************************************************************************

Acceptable Use and Business on the Internet (#37)

AUP - Acceptable Use Policies - changing over time!
All providers have their own AUPs.

Acceptable:
   1. Marketing using a dedicated server like Gopher, FTP, WWW, WAIS, and BBS,
      where the user seeks the information.
   2. Dedicated Gophers for standard product information, price lists,
      document retrieval, announcements, etc.
   3. Public databases for complex searching for info
      made available by Gopher, WWW, WAIS, FTP, etc.
   4. Anonymous FTP of multimedia for storage and playback
   5. Using mail distribution lists (Listserv) set up for the purpose of marketing
   6. Threaded newsgroups such as Usenet, set up for business purposes
   7. Very modest product announcements to appropriate Usenet newsgroups
      and mailing lists

Marketing your products and services (#38)

Unsolicited advertising - typical of humanspace, prohibited in cyberspace
Cyberspace permits and supports information-rich approaches to marketing
Advertising is intrusive and content-free
Marketing is active and value added
Sending solicited information is fine
Sending unsolicited information is not

Models of marketing on Internet (and their possibilities):
   (a) Yellow pages (ftp archives, gopher servers, bbs, Usenet news, WWW, WAIS, email)
   (b) Billboard (.plan, plan.txt, and .profile files, signature blocks, greeting cards)
   (c) Virtual storefront (all above plus capacity to take orders and to deliver)

Some of the marketing items easy to translate into Internet:
   (a) Product fliers
   (b) Product announcements
   (c) Product specification sheets
   (d) Pricing information
   (e) Catalogues
   (f) Demos
   (g) Free software
   (h) Customer support
   (i) Documentation and manuals
   (j) Multimedia descriptions
   (k) Market and consumer surveys
   (l) Product performance data
   (m) Job placement notices
   (n) Dialogues with customers

Signature blocks:
   (a) Files .sign are short attachments to e-mail messages (6 or fewer lines)
   (b) Creativity needed!
   (c) Each mailer supports a different method of inserting (hw)

Automatic replies:
   (a) Sent out automatically to any email delivered to a particular mailbox
   (b) The incoming message is saved into a file, to develop a prospects list

Business on the Internet: Industry Examples (#39)

   (a) Internet Marketing Company started by the Internet Company
       Enables users to subscribe to Internet mailing lists
       which are targeted to specific product categories.
       Subscriber receives price lists, product information, ...
       This is an example of solicited advertising, which is permitted!
   (b) Internet Business Journal started by the Strangelove Internet Enterprises
       A gopher-like service: mstrange@fonorola.net
   (c) Net Advertiser maintained by InfoNet Project - a group of scientists/students
       A mailing list which enables the entire Internet community to post
       private sales, rentals, services, announcements, ... (free of charge)
       netad@uds01.unix.st.it
   (d) Internet service providers (ISPs)
       Providing public and corporate access to Internet and its services
       EUnet is a European-based service, via secure dial-up, good for travellers.
   (e) Data Base Architects of Alameda, California
       Using Internet to hold meetings (branch offices in London, Sydney, ...)
   (f) SunSITE by Sun Microsystems Computer Corporation of Mountain View, CA, US
       A virtual storefront (including Canada, UK, Germany, ...).

*****

Frontiers of Electronic Commerce

Reference:
   Kalakota, R., Whinston, A., "Frontiers of Electronic Commerce,"
   Addison-Wesley Publishing Company, Reading, Massachusetts, USA, 1996.

Definition:
   A modern business methodology to cut costs and improve quality using Internet;
   also, selling and buying, in the narrow sense,
   as well as support (search, retrieve, etc.) for decision making.

E-commerce applications:
   (a) Supply chain management
   (b) Video on demand
   (c) Remote banking
   (d) Procurement and purchasing
   (e) On-line marketing and advertising
   (f) Home shopping

Generic framework:
   (a) Technical standards (networking, multimedia, security, applications, ...)
   (b) Legal and privacy issues

Elements of e-commerce:
   (a) Consumer devices (PC, phone, printer, etc.)
   (b) Network service provider
   (c) Information servers (video servers, corporate servers, e-publishing, ...)

Basic architecture: Client-Server
   (a) Quantity: E.g., a 90-minute video may consume over 100 GB of storage
   (b) Quality: E.g., multimedia processing may require SMP/DSM/...

Architecture of e-commerce:
   (a) Application services (customer-to-business, business-to-business, ...)
   (b) Data management (order and payment processing, ...)
   (c) Interface layer (software agents, directory support functions, ...)
   (d) Secure messaging (secure transfer, client-server security, ...)
   (e) Middleware services (documents, HTML, ...)
   (f) Network infrastructure (wireless radio, cellular phone, ATM, ...)

Electronic payment systems - electronic funds transfer (EFT):
   (a) Banking and financial payments
       Large scale (bank-to-bank transfers)
       Small scale (teller machines and cash dispensers)
       Home banking (bill payment)
   (b) Retail payments
       Credit cards (Visa or MasterCard)
       Private-label credit/debit cards (J.C. Penney Card)
       Charge cards (AmEx)
   (c) On-line electronic commerce payments
       Token based (electronic cash, electronic checks, smart cards, ...)
       Credit card based (encrypted credit cards, third party authorization numbers)

Token based systems

In their present form, none of the banking and retail payment methods
are completely adequate for the consumer oriented e-commerce environment.
Their deficiency is their assumption that:
   (a) the parties will at some time be in each other's physical presence, or
   (b) there will be a sufficient delay in the payment process
       for fraud, overdrafts, and other undesirables to be detected/corrected.
Consequently, existing methods are being modified, and new methods are being
developed, like electronic tokens in the form of electronic cash or checks,
which are backed by the bank.

Three types:
   (a) Electronic cash or real-time
       Transactions are settled with the exchange of electronic currency
   (b) Debit or prepaid
       Users pay in advance for the privilege of getting information.
       Examples: smart cards or electronic purses that store electronic money
   (c) Credit or postpaid
       Server authenticates the customers and (before the purchase)
       verifies with the bank if funds are adequate
       Examples: credit cards and electronic checks.

Each type triggers a different approach to:
   (a) The nature of the transaction (procedures)
   (b) The means of settlement used (tokens must be backed by bankware)
   (c) Approach to security, anonymity, and authentication (encryption)
   (d) Risk (who assumes what kind of risk at what time)

ELECTRONIC CASH

It must have the following four properties:
   (a) monetary value (backed by either cash or a bank-authorized value)
   (b) interoperability (exchangeable as payment for other e-cash or paper-cash)
   (c) retrievability (storable - maybe at a dedicated device - and retrievable)
   (d) security (not copyable while being exchanged)

It is based on cryptographic systems called "digital signatures."
This method involves a pair of numeric keys that work in tandem:
one for locking (encoding) and the other for unlocking (decoding).
The encoding key is kept private; the decoding key is made public.
All customers (buyers and sellers) are supplied the bank's public key.
If decoding by a customer yields a recognizable message - all OK.
Digital signatures are as secure as the mathematics involved.

Purchasing e-cash from "currency servers:"
   (a) Establishment of an account
   (b) Maintaining enough money in the account, to back purchases
Inter-currency payments imply the existence of inter-bank relationships!

In essence, e-cash is a pair [random#,value#].
Before each transaction, one first obtains the bank signature,
which is the proof that the bank will back it with real cash
(like checking the note number of each note, before using it;
the bank's signature is like the watermark in paper currency).
For more information: the First Digital Bank on Internet (DigiCash)
Important: user remains anonymous (can buy illegal products)

Mathematical procedure:
   (a) The customer's software chooses a blinding factor, R,
       independently and uniformly, at random,
       and presents the bank with X*R**E (mod PQ),
       where X is the number of the note to be signed,
       and E is the bank's public key.
   (b) The bank signs it, using the bank's private key D, as follows
       (X*R**E)**D = R*X**D (mod PQ)
   (c) On receiving the currency, the customer divides out the blinding factor
       (R*X**D)/R = X**D (mod PQ)
   (d) The customer stores X**D, the signed note
       that is used to pay for the purchase of products or services.
       Since R is random, the bank cannot determine X,
       and thus cannot connect the signing with the subsequent payment.

Once the tokens are purchased, the e-cash software on the customer's PC
stores digital money signed by the bank.
The user can spend digital money at any shop accepting e-cash,
without having to open an account there first.
As soon as the customer wants to buy something,
the software collects the necessary amount of e-cash.

SMART CARDS

Card enhanced with a microprocessor chip which holds more info
than the traditional magnetic stripe
(by the year 2000, more than 50% of cards will include a microprocessor).

Two types of smart cards:
   (a) Relationship-based smart credit cards
       Access to numerous services, based on a relationship with an institution
   (b) Electronic purses
       Vending machine compatible replacement for money

CREDIT CARD BASED ELECTRONIC PAYMENT SYSTEMS

Business as usual:
   If consumers want to purchase a product or service,
   they simply send their credit card details to the product/service provider,
   and the credit card organization handles the payment.

In on-line networks, three basic categories:
   (a) Using plain credit card details
   (b) Using encrypted credit card details
   (c) Using third party verification

Issues of importance:
   (a) Computations/communications infrastructure (EDI = Electronic Data Interchange)
   (b) Risk management (mistakes, fraud, privacy, credit decisions, ...)

EDI

It takes a manually prepared form describing the business transaction,
translates it into an electronic form, transmits it,
and processes it on the receiving end.
Essential issue - procedures (plus legality, security, and privacy)
One of the side products - MIME (Multipurpose Internet Mail Extensions)
as well as many other standards

MIME

Enables users to create and read e-mail messages containing the following:
   (a) Character sets other than ASCII
   (b) Math and other special symbols
   (c) Graphic images
   (d) Audio files and sounds
   (e) Binary files (PostScript and compressed)

The MIME header fields:
   (a) A MIME-version header (labels a message as MIME-conformant)
       This allows the MIME mail user agents to process the message appropriately
   (b) A content-type header field (specifies data types within the message)
       Various image and audio types supported
   (c) A content-transfer-encoding header field (specifies the encoding used)
       Important for getting a message through the given transport system (I-way)
   (d) Two optional header fields (content-ID and content-description)
       Serving to label and identify the data in the message

MIME UA:
   Kernel of the system - the MIME-capable mail user agent
   Parsing and dispatching before viewing!
   Advantages and disadvantages!!!

Standardization and EDI

ANSI X.12 (American National Standards Institute)
EDIFACT (United Nations EDI for Administration, Commerce, and Transport)

X.12 specifies procedures for
   (a) order placement and processing
   (b) shipping and receiving
   (c) invoicing and payment

Examples of X.12 transactions:
   (a) Vendor registration (form #838)
   (b) Request for quotation (840)
   (c) Response to request for quotation (843)
   (d) Purchase order or delivery order (850)
   (e) Purchase order acknowledgement (855)

Today, ANSI and EDIFACT are working towards compatibility!
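
The MIME header fields listed above can be seen on a real message.
The following is a minimal sketch using Python's standard email package;
the addresses, the file name, and the content-ID value are made up for
illustration, and the attachment is a placeholder rather than a real
PostScript document.

```python
from email.message import EmailMessage


def build_price_list_message() -> EmailMessage:
    """Build a two-part MIME message: a text note plus a binary attachment."""
    msg = EmailMessage()
    msg["From"] = "sales@example.com"        # hypothetical sender
    msg["To"] = "customer@example.org"       # hypothetical recipient
    msg["Subject"] = "Price list"

    # Text body: sets Content-Type, Content-Transfer-Encoding and MIME-Version.
    msg.set_content("The current price list is attached.")

    # Binary part: sent base64-encoded so it survives a text-only transport.
    msg.add_attachment(
        b"%!PS-Adobe-3.0\n",                 # placeholder bytes, not a real document
        maintype="application",
        subtype="postscript",
        filename="prices.ps",
        cid="<prices-1@example.com>",        # optional Content-ID header
    )

    # Optional Content-Description header on the attachment part.
    attachment = msg.get_payload()[1]
    attachment["Content-Description"] = "Product price list"
    return msg


if __name__ == "__main__":
    print(build_price_list_message().as_string())
```

Printing the message shows the MIME-Version header at the top level and
the Content-Type, Content-Transfer-Encoding, Content-ID, and
Content-Description headers on the attachment part.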

Internet-based EDI

Factors that make Internet useful for EDI:
   (a) Flat pricing
   (b) Cheap access
   (c) Common mail standards
   (d) Security

Intraorganizational electronic commerce

Competitive advantage through better:
   (a) process automation
   (b) work-flow management
   (c) customization (time-to-market and flexibility-in-operations)
   (d) supply chain management, etc.

New forms of organizational structure:
   (a) Virtual offices
   (b) Virtual or network organizational structure
       Closely coupled upstream with suppliers, downstream with customers
   (c) Electronic brokerage to increase the efficiency of internal markets;
       in essence, multiple services provided by a single interface
       with a single point of accountability on an order-by-order basis
       (example: Peapod)
   (d) Data and information warehouses
       with decision support systems and information filtering agents

New trends:
   (a) VRML - Virtual Reality Modeling Language for WWW
   (b) ETET - Electronic Training and Education Tools
   (c) Electronic Publishing and Reinforcement of Digital Copyright
       No downloading
       No electronic storage
       No copies or distribution, even internally
       No copies to third parties
       Specific limitations on type of use
   (d) IASA - Intelligent Autonomous Software Agent

SOFTWARE AGENTS

The existing tool-based model = the "do what I say" model
   (a) User initiates actions which are passively facilitated by software
   (b) Local interaction

The new agent-based model = the "do what I imply" model
   (a) Mimicking the role of a highly competent secretary
   (b) Global interaction

Types of software agents:
   (a) Static or computer-bound (like office-bound workers)
   (b) Dynamic or network-wide (like mobile field workers)

Example of a static agent:
   (a) Mail agent responsible for automatic reply
   (b) Using embedded knowledge to assist in filtering and processing the volume

Example of a dynamic agent:
   (a) Running on a remote node and periodically reporting to the home node
   (b) Working for hours or days after being unleashed

Issues:
   (a) Working based on directives
       Selling a stock after a certain level is reached
   (b) Cooperating with other agents
       Setting a meeting time, and rescheduling when a higher level agent comes

Why software agents:
   (a) Managing the information overload (filtering and sorting the input data)
   (b) Decision support (expert systems)
   (c) Repetitive activity (labor cost contributes over 50% to final price)
   (d) Mundane personal activity (booking air tickets)
   (e) Search and retrieval (to create an output data stream)
   (f) Domain experts (making the costly expertise widely available)

Properties of mobile software agents:
   (a) Programmability
       Must be programmed or instructed in some manner
   (b) Safety
       Remote hosts must be sure that the agent will create no harm of any kind;
       the greatest concern (viruses are kinds of agents)
   (c) Resource carefulness
       As an owner, you must be sure that your agent will not exceed the budget;
       as a host, you must be sure that the visitor will not overuse resources
   (d) Navigationability
       The agent must be able to find the needed resources
   (e) Privacy
       The agent's internal program and state should not be visible to others
   (f) Communicationability
       The agent must be able to communicate with the owner
       even if the network is down

What makes an agent intelligent:
   (a) Agent independency
   (b) Agent learning
   (c) Agent cooperation
   (d) Agent reasoning (rule-based, knowledge-based, ...)
   (e) Smart interface

Components of a software agent:
   (a) Owner
       Parent process name or master agent name
   (b) Author
       Templates for customization or contacts for consulting
   (c) Lifetime
       Death condition
   (d) Account
       Billing related information or links to owner accounts
   (e) Goal
       Measures of success
   (f) Subject
       Description of the goal's attributes
   (g) Background
       Supporting information in the form understandable by the agent

Launching software agents:
   (a) A request is placed to a network resource (e.g., node)
   (b) A permission is granted (e.g., access)

Launching technologies:
   (a) Synchronous remote procedure call - using request-reply cycle
   (b) Asynchronous remote programming - using message-oriented system agents
   (c) Database middleware - using protocols for access to relational databases

Issues:
   (a) Controlling an agent on the fly
   (b) Protecting an agent on the fly

Agent oriented languages:
   (a) Telescript (postscript)
   (b) Safe (active email)
   (c) Java (called from an HTML document)

Main categories of software agents:
   (a) Event monitors (on condition triggers)
   (b) Work flow assistants (on demand advisers)
   (c) Internet data gathering and retrieval agents (on-line searchers)

Issues of importance:
   (a) Broadband/static communications and computing (hw)
   (b) Wireless/mobile communications and computing (hw)

ACTIVE DOCUMENT ARCHITECTURE (ADA)

Enabled by advances in broadband and wireless technologies!
Enabled by the move from PC to PG (personal gateway to the network)
ADA integrates multimedia information from different sources on the network
in a disciplined manner (mixing the applets)

Three elements of ADA:
   (a) Single process local containers/workspaces
   (b) Shared containers/workspaces
   (c) Network distributed objects

Approaches to ADA:
   (a) Single platform integration
   (b) Multiple platform integration
   (c) Object-to-object calling mechanism
       Microsoft Dynamic Data Exchange
       IBM System Object Model

CORBA - Common Object Request Broker Architecture

Essence:
   A specification of a common messaging standard for distributed objects

Four key elements of CORBA:
   (a) Object request broker
       A language which insulates the client from the communication mechanisms used
   (b) Object services
       Higher level services - transactions, concurrency control, licensing, ...
   (c) Common facilities
       Rules of interaction in four domains:
          User interface
          Information management
          System management
          Task management
   (d) Application objects
       Components of specific end-user applications

Conclusion:
   CORBA is universal plumbing needed to promote e-commerce on I-way

SELECTED PAPERS ABOUT INTELLIGENT AGENTS ON INTERNET

Chen, H., Chung, Y.-M., Ramsey, M., Yang, C., Ma, P.-C., Yen, J.,
"Intelligent Spider for Internet Searching,"
Proceedings of the HICSS-97, Maui, Hawai'i, USA, pp. 178-188.

This paper introduces a new interactive genetic search algorithm,
which is better than traditional genetic search without on-line adjustments
(the worst case of which is the best-search algorithm)

The number of home pages doubles every 6 months!
Consequently, searching is a challenge!!!
Intelligent searching agents are called "spiders."

Major problems:
   (a) Information overload
   (b) Vocabulary differences

Main information retrieval mechanisms:
   (a) keyword search (Lycos at CMU and Yahoo at Stanford)
   (b) hypertext browsing (Mosaic and Netscape)

Two main approaches to Internet searching:
   (a) client-based searching spider
   (b) on-line database indexing and searching

Client-based searching spiders:
   (a) TueMosaic based on the best first search
   (b) TueMosaic v2.42 based on the fish search algorithm
   (c) WebCrawler based on an improved fish search algorithm

The Best First Search elements:
   (a) Current homepage (one or a set)
   (b) User specified set of keywords
   (c) Depth and width of search for links contained in the current homepage

The Fish Search - a modification of the Best First Search:
   (a) Each URL corresponds to a fish
   (b) After the document is retrieved, fish spawns children (URLs)
   (c) These URLs are "produced" only if relevant (not unconditionally)

Drawbacks of Best First and Fish Search:
   (a) Potentially relevant homepages which do not connect with the current one
       are inaccessible!
   (b) The search is exponential, with the increase of depth and width.

The Crawler Search - a modification of the Fish Search:
   (a) Search initiated using index
   (b) Links followed in an intelligent order:
       Relevance of a link is evaluated using the anchor test
       Anchor test measures similarity between anchor text and user query
       Anchor text is the set of words describing the link to another document
       Anchor text is a small subset of the document
       Search speed versus search quality
       Essence: If weak links are avoided - more strong links per unit of time!
   (c) Used by America Online since January 1995

On-line database indexing and searching
   (a) Entire WWW documents are retrieved and stored in the host server
   (b) All relevant information is indexed on the host server
   (c) This creates a server-based replica of all information on the WWW
   (d) Index is used as a search key

Examples:
   (a) WWWW - World Wide Web Worm
   (b) ALIWEB
   (c) Harvest Information Discovery and Access System
   (d) University of Arizona WWW Lab
   (e) Lycos
   (f) Excite
   (g) Yahoo
   (h) Alta Vista

Architecture of an Intelligent Spider (5 components):
   (a) Requests and control
   (b) Graphical user interface
   (c) Search engine
   (d) Home page fetching
   (e) Indexing source

Requests and Control (#1):
   (a) Users submit queries with information such as
       1. Starting URL(s)
       2. Keywords
       3. Number of URLs expected to return
       4. Category of the searching space
   (b) When a query is submitted
       the appropriate searching space is invoked in the available database

Graphical User Interface (#2):
   (a) A link between the submitted query and the searching engine
   (b) Important that users can view intermediate results

Search engine (#3):
   (a) Genetic algorithm
   (b) Simulated annealing

Homepage fetching (#4):
   (a) Public fetching machines (Lynx and HtmlGobble)
   (b) Custom fetching machines (Arizona and Serbia)

Indexing score (#5):
   (a) Major goal of indexing is to identify the contents of a WWW document
   (b) Major procedures of indexing are
       1. Word identification (ignored: case and punctuation)
       2. Word filter (extracted: common function, pure, and general words)

Comparing the similarity of homepages - Jaccard's score:
   (a) A homepage with a higher Jaccard score
       has a higher fitness with the input homepage
   (b) Score computed from links or from indexing

SCORE FROM LINKS:
   Homepages are x and y
   Their links are X={x1,x2,...} and Y={y1,y2,...}
   Jaccard's score between x and y is equal to J = |X ∩ Y| / |X ∪ Y|
   (number of shared links divided by the number of distinct links).
   If X=Y then J=1
   If X and Y have no link in common then J=0

SCORE FROM INDEXING:
   #1. Total number of homepages is counted                              N
   #2. Terms of a homepage are identified                                set t
   #3. Total number of terms is counted                                  L
   #4. The number of words in term j is calculated                       w(j)
   #5. Term frequency (number of occurrences of term j in homepage x)    tf(xj)
   #6. Homepage frequency (number of homepages in set N where term j occurs)  df(j)
   #7. Combined weight of term j in homepage x                           d(xj)

   d(xj) = tf(xj) * log[w(j)*(N/df(j))]
   J = A/(B+C-A)
   A = SIGMA [from j=1 to L] of [d(xj)*d(yj)]
   B = SIGMA [from j=1 to L] of [d(xj)**2]
   C = SIGMA [from j=1 to L] of [d(yj)**2]
   (for identical homepages A=B=C, so J=1, as in the score from links)

DETAILS OF THE BEST FIRST SEARCH:

Essence:
   (a) Looking for the best homepage in each iteration
   (b) The number of iterations is equal to the number of required homepages

Algorithm:
   (a) Initialization and input
       #1. Initialize k to 1
       #2. Obtain the initial set of homepages from the user(s)
       #3. Homepages from the initial set are fetched
       #4. Linked homepages of input set are saved in H={h1,h2,...}
   (b) Determining the best homepage in H
       #1. Determine the maximal J for all elements of set H
       #2. Score computed as:
           Jlinks(hi)=(1/N)*SIGMA(j=1 to N)of[Jlinks(inputj,hi)]
           Jindex(hi)=(1/N)*SIGMA(j=1 to N)of[Jindex(inputj,hi)]
           J(hi)=(1/2)*[Jlinks(hi)+Jindex(hi)]
           (here N is the number of input homepages)
       #3. The homepage in H with the largest Jaccard score is saved as OUTPUTk
   (c) Fetch the best homepage
       #1. Fetch the best homepage from OUTPUTk
       #2. Increase k by 1
   (d) Repeat until all output homepages obtained

DETAILS OF THE GENETIC ALGORITHM

   (a) Initialization of the search space
   (b) Crossover and mutation (SWISH)
   (c) Stochastic selection based on fitness
   (d) Converging

Reference:
   Goldberg, D.E., "Genetic Algorithms in Search, Optimization,
   and Machine Learning," Addison-Wesley, Reading, Massachusetts, 1989.

Note: Applied in several virtual proxy servers!

TRADITIONAL PROXY SERVERS

An add-on to WWW servers to provide caching and security
(a part of a WWW server)

VIRTUAL PROXY SERVERS

A middle layer between WWW servers and client browsers,
responsible not only for caching and security,
but also for search, indexing, filtering, profiling, agenting, ...
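
Going back to the Jaccard scores and the selection step of the Best
First Search described above, the computation can be sketched in a few
lines of Python. This is an illustration under stated assumptions, not
the code of the cited paper: the index score uses the weighted form
A/(B+C-A), which gives 1 for identical homepages; the overall score is
taken as the mean of the link score and the index score; and the page
layout (a dictionary with "url", "links", and "weights") and all
function names are hypothetical.

```python
import math


def jaccard_links(x_links, y_links):
    """Link-based score: shared links divided by all distinct links."""
    x, y = set(x_links), set(y_links)
    union = x | y
    return len(x & y) / len(union) if union else 0.0


def term_weight(tf, df, n_pages, words_in_term):
    """Combined weight d(xj) = tf(xj) * log[w(j) * (N / df(j))]."""
    return tf * math.log(words_in_term * n_pages / df)


def jaccard_index(dx, dy):
    """Index-based score from two {term: weight} dictionaries.

    Assumption: weighted Jaccard form A / (B + C - A).
    """
    a = sum(w * dy.get(term, 0.0) for term, w in dx.items())
    b = sum(w * w for w in dx.values())
    c = sum(w * w for w in dy.values())
    denom = b + c - a
    return a / denom if denom else 0.0


def best_candidate(inputs, candidates):
    """One Best First Search step: pick the candidate with the largest score.

    Each page is a dict {"url": str, "links": set, "weights": dict},
    a hypothetical layout chosen for this sketch.
    """
    def score(h):
        j_links = sum(jaccard_links(p["links"], h["links"]) for p in inputs) / len(inputs)
        j_index = sum(jaccard_index(p["weights"], h["weights"]) for p in inputs) / len(inputs)
        return 0.5 * (j_links + j_index)  # assumption: mean of the two scores

    return max(candidates, key=score)
```

Calling best_candidate once per required output homepage, and removing
the chosen page from the candidate set each time, corresponds to steps
(b) to (d) of the algorithm.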
Reference:
   Wu, S., Liao, C.-C.,
   "Virtual Proxy Servers for WWW and Intelligent Agents on the Internet,"
   Proceedings of the HICSS-97, Maui, Hawai'i, USA, January 1997, pp. 200-209.

Note: All elements of the puzzle (e-commerce) are in place; payment is the crown!

STATE OF THE ART IN ELECTRONIC PAYMENT SYSTEMS

Reference:
   Asokan, N., Janson, P.A., Steiner, M., Waidner, M.,
   "The State of the Art in Electronic Payment Systems,"
   IEEE COMPUTER, September 1997, pp. 28-35.
   (http://www.semper/org/sirene/outsideworld/ecommerce.html)

Note: Electronic funds transfer over financial networks is reasonably secure,
but securing payments over Internet is a challenge!

CASE STUDY IN ELECTRONIC PAYMENT SYSTEMS

Reference:
   Tenenbaum, J.M., Chowdhry, T.S., Hughes, K.,
   "Eco System: An Internet Commerce Architecture,"
   IEEE COMPUTER, May 1997, pp. 48-55.
   (http://www.commerce.net)

Note: A framework of frameworks is proposed for robust e-commerce!
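
As a small numeric companion to the payment references above, the
blinding procedure from the ELECTRONIC CASH section can be tried out
with toy numbers. The sketch below assumes textbook RSA: the primes
P=61 and Q=53 and the public exponent E=17 are illustrative values, far
too small for real use, and the function names are hypothetical. It
needs Python 3.8 or later for the modular inverse via pow.

```python
import math
import secrets


def make_bank_keys(p=61, q=53, e=17):
    """Return (n, e, d): modulus PQ, public key E, private key D (toy sizes)."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)          # modular inverse of E
    return n, e, d


def blind(x, n, e, r=None):
    """Customer: hide note number X as X * R**E (mod PQ); returns (blinded, R)."""
    if r is None:
        r = secrets.randbelow(n - 2) + 2
        while math.gcd(r, n) != 1:       # R must be invertible mod PQ
            r = secrets.randbelow(n - 2) + 2
    return (x * pow(r, e, n)) % n, r


def sign(blinded, n, d):
    """Bank: sign the blinded value; the result equals R * X**D (mod PQ)."""
    return pow(blinded, d, n)


def unblind(signed_blinded, r, n):
    """Customer: divide out R to obtain the signed note X**D (mod PQ)."""
    return (signed_blinded * pow(r, -1, n)) % n


def verify(x, signature, n, e):
    """Shop: check the bank's signature with the bank's public key."""
    return pow(signature, e, n) == x % n


if __name__ == "__main__":
    n, e, d = make_bank_keys()
    note = 1234                           # hypothetical note number X
    blinded, r = blind(note, n, e)
    signed_note = unblind(sign(blinded, n, d), r, n)
    print(verify(note, signed_note, n, e))
```

The bank only ever sees the blinded value, yet the unblinded result is
the same signature it would have produced on X directly, which is what
keeps the payer anonymous.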