RSS(XML) reading process

October 31, 2007

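// Create the parser and register this object's methods as its handlers.
$this->parser = xml_parser_create();
// Note: on PHP 8+ xml_parser_create() returns an XMLParser object,
// so is_resource() is only true on older versions.
if (is_resource($this->parser)) {
    // Pass $this directly; call-time pass-by-reference (&$this) is an error since PHP 5.4.
    xml_set_object($this->parser, $this);
    xml_set_element_handler($this->parser, 'feed_start_element', 'feed_end_element');
    xml_set_character_data_handler($this->parser, 'feed_cdata');
    return true;
}
return false;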
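
The snippet above is only the set-up step. A minimal sketch of a whole reader built around it might look like the following. The class name FeedReader, the titles property, the parse() method and the handler bodies are my own assumptions for illustration, not the original code. The handlers are registered as [$this, 'method'] callables instead of going through xml_set_object(), because as far as I know xml_set_object() is deprecated in recent PHP releases and the callable form works on old and new versions alike.

```php
<?php
// Hypothetical reader: collects the text of every <title> element.
class FeedReader
{
    private $parser;
    private $current = '';   // name of the element currently open
    public $titles = [];

    public function __construct()
    {
        $this->parser = xml_parser_create('UTF-8');
        // Callable arrays bind the handlers to this object.
        xml_set_element_handler(
            $this->parser,
            [$this, 'feed_start_element'],
            [$this, 'feed_end_element']
        );
        xml_set_character_data_handler($this->parser, [$this, 'feed_cdata']);
    }

    public function feed_start_element($parser, $name, $attribs)
    {
        // Case-folding is on by default, so the name arrives as "TITLE".
        $this->current = $name;
        if ($name === 'TITLE') {
            $this->titles[] = '';
        }
    }

    public function feed_end_element($parser, $name)
    {
        $this->current = '';
    }

    public function feed_cdata($parser, $data)
    {
        // Character data may arrive in several chunks, so append.
        if ($this->current === 'TITLE') {
            $this->titles[count($this->titles) - 1] .= $data;
        }
    }

    public function parse($xml)
    {
        // xml_parse() returns 1 on success and 0 on failure.
        return xml_parse($this->parser, $xml, true) === 1;
    }
}

$reader = new FeedReader();
$ok = $reader->parse(
    '<rss><channel><title>Example</title>'
    . '<item><title>First post</title></item></channel></rss>'
);
print_r($reader->titles);
```

The sketch keeps only one piece of state (the open element name), which is enough for flat text elements like title but would need a stack for anything nested.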

xml_parser_create
(PHP 3 >= 3.0.6, PHP 4, PHP 5)
xml_parser_create — Create an XML parser

Description

resource xml_parser_create ( [string encoding])

xml_parser_create()
creates a new XML parser and returns a resource handle referencing it to be used by the other XML functions.

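In short: xml_parser_create() creates a parser and returns the resource handle that the related XML functions use.
The optional encoding argument tells the parser which character encoding the XML input uses.
ISO-8859-1, UTF-8, and US-ASCII can be used.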

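The optional encoding specifies the character encoding of the XML input to be parsed.
Supported encodings are "ISO-8859-1", which is also the default if no encoding is specified,
"UTF-8" and "US-ASCII".
(This describes PHP 4. According to the later PHP manual, from PHP 5 the input encoding is detected automatically, the argument sets only the output encoding, and the default output encoding became UTF-8.)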
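
To see what the encoding argument does in practice, here is a small sketch. The helper name parse_text is hypothetical. It assumes PHP 5 or later, where (as far as I understand the manual) the argument governs the encoding of the strings handed to the handlers. The input uses a numeric character reference so that it is plain ASCII and only the output side varies.

```php
<?php
// Hypothetical helper: parse $xml with the given encoding argument and
// return the reported target encoding plus the collected character data.
function parse_text($encoding, $xml)
{
    $text = '';
    $parser = xml_parser_create($encoding);
    xml_set_character_data_handler(
        $parser,
        function ($parser, $data) use (&$text) {
            $text .= $data;
        }
    );
    xml_parse($parser, $xml, true);
    $target = xml_parser_get_option($parser, XML_OPTION_TARGET_ENCODING);
    return [$target, $text];
}

// &#233; is the character reference for "é" (U+00E9).
$xml = '<t>&#233;</t>';

list($utf8Name, $utf8Text)   = parse_text('UTF-8', $xml);
list($latinName, $latinText) = parse_text('ISO-8859-1', $xml);

// bin2hex() makes the byte-level difference visible.
echo $utf8Name, ': ', bin2hex($utf8Text), "\n";
echo $latinName, ': ', bin2hex($latinText), "\n";
```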

is_resource
(PHP 4, PHP 5)
is_resource — Finds whether a variable is a resource

Description

bool is_resource ( mixed $var )
Finds whether the given variable is a resource.

is_resource() returns TRUE if the variable given as the var argument is a resource,
and FALSE otherwise.

Parameters
var
The variable being evaluated.

Return Values
Returns TRUE if var is a resource, FALSE otherwise.
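
One caveat when combining is_resource() with the XML functions: since PHP 8.0 xml_parser_create() returns an XMLParser object rather than a resource, so the is_resource() check in the snippet at the top would fail there. A minimal sketch of a check that copes with both follows. The helper name is_xml_parser is my own invention.

```php
<?php
// Hypothetical helper: true for a parser handle on old and new PHP alike.
function is_xml_parser($value)
{
    // PHP 4/5/7 hand back a resource, PHP 8+ an XMLParser object.
    // instanceof simply yields false when the class does not exist.
    return is_resource($value) || $value instanceof XMLParser;
}

$parser  = xml_parser_create();
$valid   = is_xml_parser($parser);
$invalid = is_xml_parser('not a parser');

var_dump($valid, $invalid);
```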

xml_set_object
(PHP 4, PHP 5)
xml_set_object — Use XML Parser within an object

Description

void xml_set_object ( resource parser, object object)

This function allows parser to be used inside object.
All callback functions set with xml_set_element_handler() and the related functions are then assumed to be methods of object.

xml_set_element_handler
(PHP 3 >= 3.0.6, PHP 4, PHP 5)
xml_set_element_handler — Set up start and end element handlers

Description

bool xml_set_element_handler
( resource parser, callback start_element_handler, callback end_element_handler)

Sets the element handler functions for the XML parser parser.
start_element_handler and end_element_handler are strings containing the names of functions that must exist when xml_parse() is called for parser.

The function named by start_element_handler must accept three parameters:
start_element_handler ( resource parser, string name, array attribs)

parser
The first parameter, parser, is a reference to the XML parser calling the handler.

name
The second parameter, name, contains the name of the element for which this handler is called.
If case-folding is in effect for this parser, the element name will be in uppercase letters.

attribs
The third parameter, attribs, contains an associative array with the element’s attributes (if any).
The keys of this array are the attribute names, the values are the attribute values.
Attribute names are case-folded on the same criteria as element names.
Attribute values are not case-folded.

The original order of the attributes can be retrieved by walking through attribs the normal way,
using foreach (the manual of the time suggested each(), which was deprecated in PHP 7.2 and removed in PHP 8.0).
The first key in the array was the first attribute, and so on.

The function named by end_element_handler must accept two parameters:
end_element_handler ( resource parser, string name)

parser
The first parameter, parser, is a reference to the XML parser calling the handler.

name
The second parameter, name, contains the name of the element for which this handler is called.
If case-folding is in effect for this parser, the element name will be in uppercase letters.

If a handler function is set to an empty string, or FALSE, the handler in question is disabled.
TRUE is returned if the handlers are set up, FALSE if parser is not a valid parser.
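
The effect of case-folding on element and attribute names is easy to check with a short sketch. The function name collect_elements and the sample document are assumptions for illustration. The handlers are given as closures, which works from PHP 5.3 on.

```php
<?php
// Hypothetical helper: record every start tag as [name, attribs].
function collect_elements($xml, $caseFolding)
{
    $seen = [];
    $parser = xml_parser_create('UTF-8');
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, $caseFolding ? 1 : 0);
    xml_set_element_handler(
        $parser,
        // start handler: three parameters
        function ($parser, $name, $attribs) use (&$seen) {
            $seen[] = [$name, $attribs];
        },
        // end handler: two parameters, nothing to do in this sketch
        function ($parser, $name) {
        }
    );
    xml_parse($parser, $xml, true);
    return $seen;
}

$xml = '<feed><link rel="self" href="/feed"/></feed>';

$folded    = collect_elements($xml, true);   // the default behaviour
$asWritten = collect_elements($xml, false);  // names kept as in the document

print_r($folded);
print_r($asWritten);
```

With folding on, both the element names and the attribute keys come back in uppercase, while the attribute values stay untouched.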

xml_set_character_data_handler
(PHP 3 >= 3.0.6, PHP 4, PHP 5)
xml_set_character_data_handler — Set up character data handler

Description

bool xml_set_character_data_handler
( resource parser, callback handler)

Sets the character data handler function for the XML parser parser.
handler is a string containing the name of a function that must exist when xml_parse() is called for parser.

The function named by handler must accept two parameters:
handler ( resource parser, string data)

parser
The first parameter, parser, is a reference to the XML parser calling the handler.

data
The second parameter, data, contains the character data as a string.

If a handler function is set to an empty string, or FALSE, the handler in question is disabled.
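
One thing the description does not spell out: the parser is free to deliver the text of a single element in several calls, so a handler should append rather than overwrite. A rough sketch, with variable names of my own choosing:

```php
<?php
$buffer = '';   // accumulated character data
$calls  = 0;    // how often the handler fired (depends on the parser)

$parser = xml_parser_create('UTF-8');
xml_set_character_data_handler(
    $parser,
    function ($parser, $data) use (&$buffer, &$calls) {
        $buffer .= $data;   // append: $data may be only a fragment
        $calls++;
    }
);
xml_parse($parser, '<title>AT&amp;T bought by SBC!</title>', true);

echo $buffer, "\n";
echo 'handler calls: ', $calls, "\n";
```

The number of calls is an implementation detail of the underlying XML library, so code should rely only on the concatenated result.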

Atom Syndication Format

September 15, 2007

* Licence / source: http://www.atomenabled.org/developers/syndication/

Elements of <feed>

Required feed elements

id
Identifies the feed using a universally unique and permanent URI. If you have a long-term, renewable lease on your Internet domain name, then you can feel free to use your website’s address.
<id>http://example.com/</id>

title
Contains a human readable title for the feed. Often the same as the title of the associated website. This value should not be blank.
<title>Example, Inc.</title>

updated
Indicates the last time the feed was modified in a significant way.
<updated>2003-12-13T18:30:02Z</updated>
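
As an illustration, the three required elements can be put together with PHP's DOM extension. This is only a sketch: the helper atom_child is hypothetical, and the namespace URI http://www.w3.org/2005/Atom is not mentioned in the notes above but is the namespace Atom 1.0 documents use.

```php
<?php
const ATOM_NS = 'http://www.w3.org/2005/Atom';

// Hypothetical helper: append <name>text</name> in the Atom namespace.
function atom_child(DOMDocument $doc, DOMElement $parent, $name, $text)
{
    $el = $doc->createElementNS(ATOM_NS, $name);
    // Text nodes are escaped automatically when the document is saved.
    $el->appendChild($doc->createTextNode($text));
    $parent->appendChild($el);
    return $el;
}

$doc  = new DOMDocument('1.0', 'UTF-8');
$feed = $doc->createElementNS(ATOM_NS, 'feed');
$doc->appendChild($feed);

// The three required feed elements, using the sample values from above.
atom_child($doc, $feed, 'id', 'http://example.com/');
atom_child($doc, $feed, 'title', 'Example, Inc.');
atom_child($doc, $feed, 'updated', '2003-12-13T18:30:02Z');

$xml = $doc->saveXML();
echo $xml;
```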

Recommended feed elements

author
Names one author of the feed. A feed may have multiple author elements. A feed must contain at least one author element unless all of the entry elements contain at least one author element.
<author>
<name>John Doe</name>
<email>JohnDoe@example.com</email>
<uri>http://example.com/~johndoe</uri>
</author>

link
Identifies a related Web page. The type of relation is defined by the rel attribute. A feed is limited to one alternate per type and hreflang. A feed should contain a link back to the feed itself.
<link rel="self" href="/feed" />

Optional feed elements

category
Specifies a category that the feed belongs to. A feed may have multiple category elements.
<category term="sports"/>

contributor
Names one contributor to the feed. A feed may have multiple contributor elements.
<contributor>
<name>Jane Doe</name>
</contributor>

generator
Identifies the software used to generate the feed, for debugging and other purposes. Both the uri and version attributes are optional.
<generator uri="/myblog.php" version="1.0">
Example Toolkit
</generator>

icon
Identifies a small image which provides iconic visual identification for the feed. Icons should be square.
<icon>/icon.jpg</icon>

logo
Identifies a larger image which provides visual identification for the feed. Images should be twice as wide as they are tall.
<logo>/logo.jpg</logo>

rights
Conveys information about rights, e.g. copyrights, held in and over the feed.
<rights> © 2005 John Doe </rights>

subtitle
Contains a human-readable description or subtitle for the feed.
<subtitle>all your examples are belong to us</subtitle>


Elements of <entry>

Required Elements of <entry>

id
Identifies the entry using a universally unique and permanent URI. Suggestions on how to make a good id can be found here. Two entries in a feed can have the same value for id if they represent the same entry at different points in time.
<id>http://example.com/blog/1234</id>

title
Contains a human readable title for the entry. This value should not be blank.
<title>Atom-Powered Robots Run Amok</title>

updated
Indicates the last time the entry was modified in a significant way. This value need not change after a typo is fixed, only after a substantial modification. Generally, different entries in a feed will have different updated timestamps.
<updated>2003-12-13T18:30:02-05:00</updated>
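
The timestamps in these examples follow the RFC 3339 pattern, either with a numeric offset or with a trailing Z for UTC. A small sketch of producing both forms in PHP follows. The function name atom_timestamp_utc is my own, and it assumes PHP 5.5 or later for DateTimeImmutable.

```php
<?php
// Hypothetical helper: normalise any parseable time to the UTC "Z" form.
function atom_timestamp_utc($input)
{
    $when = new DateTimeImmutable($input);
    return $when->setTimezone(new DateTimeZone('UTC'))
                ->format('Y-m-d\TH:i:s\Z');
}

// DATE_ATOM is PHP's built-in format string for the offset form.
$local      = new DateTimeImmutable('2003-12-13T18:30:02-05:00');
$withOffset = $local->format(DATE_ATOM);

$utc  = atom_timestamp_utc('2003-12-13T18:30:02-05:00');
$same = atom_timestamp_utc('2003-12-13T18:30:02Z');

echo $withOffset, "\n", $utc, "\n", $same, "\n";
```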

Recommended elements of <entry>

author
Names one author of the entry. An entry may have multiple authors. An entry must contain at least one author element unless there is an author element in the enclosing feed, or there is an author element in the enclosed source element.
<author>
<name>John Doe</name>
</author>

content
Contains or links to the complete content of the entry. Content must be provided if there is no alternate link, and should be provided if there is no summary.
<content>complete story here</content>

link
Identifies a related Web page. The type of relation is defined by the rel attribute. An entry is limited to one alternate per type and hreflang. An entry must contain an alternate link if there is no content element.
<link rel="alternate" href="/blog/1234"/>

summary
Conveys a short summary, abstract, or excerpt of the entry. Summary should be provided if there either is no content provided for the entry, or that content is not inline (i.e., contains a src attribute), or if the content is encoded in base64.
<summary>Some text.</summary>
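
The rules so far (id, title and updated are required, and an entry needs either content or an alternate link) can be turned into a rough checker. This is a sketch under my own assumptions: the function name entry_problems and the wording of the messages are invented, and it looks only at direct children of the entry in the Atom 1.0 namespace.

```php
<?php
const ATOM_NS = 'http://www.w3.org/2005/Atom';

// Hypothetical checker: list which of the rules above an <entry> breaks.
function entry_problems(DOMElement $entry)
{
    $found = [];            // names of the child elements that are present
    $hasAlternate = false;

    foreach ($entry->childNodes as $child) {
        if (!($child instanceof DOMElement) || $child->namespaceURI !== ATOM_NS) {
            continue;       // skip text nodes and foreign elements
        }
        if ($child->localName === 'link') {
            // rel defaults to "alternate" when the attribute is absent
            $rel = $child->hasAttribute('rel') ? $child->getAttribute('rel') : 'alternate';
            if ($rel === 'alternate') {
                $hasAlternate = true;
            }
        } else {
            $found[$child->localName] = true;
        }
    }

    $problems = [];
    foreach (['id', 'title', 'updated'] as $name) {
        if (!isset($found[$name])) {
            $problems[] = "missing $name";
        }
    }
    if (!isset($found['content']) && !$hasAlternate) {
        $problems[] = 'needs content or an alternate link';
    }
    return $problems;
}

$doc = new DOMDocument();
$doc->loadXML(
    '<entry xmlns="http://www.w3.org/2005/Atom">'
    . '<id>http://example.com/blog/1234</id>'
    . '<title>Atom-Powered Robots Run Amok</title>'
    . '</entry>'
);
$problems = entry_problems($doc->documentElement);
print_r($problems);
```

It deliberately ignores the "should" recommendations (summary, author) and checks only the hard requirements.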

Optional elements of <entry>

category
Specifies a category that the entry belongs to. An entry may have multiple category elements.
<category term="technology"/>

contributor
Names one contributor to the entry. An entry may have multiple contributor elements.
<contributor>
<name>Jane Doe</name>
</contributor>

published
Contains the time of the initial creation or first availability of the entry.
<published>2003-12-13T09:17:51-08:00</published>

source
If an entry is copied from one feed into another feed, then the source feed’s metadata (all child elements of feed other than the entry elements) should be preserved if the source feed contains any of the child elements author, contributor, rights, or category and those child elements are not present in the source entry.
<source>
<id>http://example.org/</id>
<title>Forty-Two</title>
<updated>2003-12-13T18:30:02Z</updated>
<rights>© 2005 Example, Inc.</rights>
</source>

rights
Conveys information about rights, e.g. copyrights, held in and over the entry.
<rights type="html">
© 2005 John Doe
</rights>


Common Constructs

Category

<category> has one required attribute, term, and two optional attributes, scheme and label.
term identifies the category
scheme identifies the categorization scheme via a URI.
label provides a human-readable label for display

Content

<content> either contains, or links to, the complete content of the entry.

Link

<link> is patterned after html’s link element. It has one required attribute, href, and five optional attributes: rel, type, hreflang, title, and length.

href is the URI of the referenced resource (typically a Web page)

rel contains a single link relationship type. It can be a full URI, or one of the following predefined values (default=alternate):

  • alternate: an alternate representation of the entry or feed, for example a permalink to the html version of the entry, or the front page of the weblog.
  • enclosure: a related resource which is potentially large in size and might require special handling, for example an audio or video recording.
  • related: a document related to the entry or feed.
  • self: the feed itself.
  • via: the source of the information provided in the entry.

type indicates the media type of the resource.

hreflang indicates the language of the referenced resource.
title provides human-readable information about the link, typically for display purposes.
length gives the length of the resource, in bytes.
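
Reading a link element then comes down to taking href and filling in the default for rel. A minimal sketch with the DOM extension follows. The helper name read_link and the sample attribute values (the audio file, its type and length) are made up for illustration.

```php
<?php
// Hypothetical helper: turn an Atom <link> into an array,
// applying the default rel="alternate".
function read_link(DOMElement $link)
{
    $out = [
        'href' => $link->getAttribute('href'),   // the one required attribute
        'rel'  => 'alternate',                   // default when rel is absent
    ];
    foreach (['rel', 'type', 'hreflang', 'title', 'length'] as $name) {
        if ($link->hasAttribute($name)) {
            $out[$name] = $link->getAttribute($name);
        }
    }
    return $out;
}

$doc = new DOMDocument();

$doc->loadXML('<link xmlns="http://www.w3.org/2005/Atom" href="/blog/1234"/>');
$plain = read_link($doc->documentElement);

$doc->loadXML(
    '<link xmlns="http://www.w3.org/2005/Atom" rel="enclosure" '
    . 'type="audio/mpeg" length="1337" href="/audio.mp3"/>'
);
$enclosure = read_link($doc->documentElement);

print_r($plain);
print_r($enclosure);
```

Attribute values come back as strings, so length would still need a cast before doing arithmetic with it.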

Person

<author> and <contributor> describe a person, corporation, or similar entity.
Each has one required element, name, and two optional elements: uri and email.
<name> conveys a human-readable name for the person.
<uri> contains a home page for the person.
<email> contains an email address for the person.

Text

<title>, <summary>, <content>, and <rights> contain human-readable text, usually in small quantities.
The type attribute determines how this information is encoded (default="text").

If type="text", then this element contains plain text with no entity-escaped html.
<title type="text">AT&amp;T bought by SBC!</title>

If type="html", then this element contains entity-escaped html.
<title type="html">
AT&amp;amp;T bought &lt;b&gt;by SBC&lt;/b&gt;!
</title>

If type="xhtml", then this element contains inline xhtml, wrapped in a div element.
<title type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
AT&amp;T bought <b>by SBC</b>!
</div>
</title>
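
The difference between type="text" and type="html" is mostly a question of how many times the value gets escaped. A sketch with htmlspecialchars() follows. The function name atom_title is hypothetical, and it assumes PHP 5.4 or later for the ENT_XML1 flag. The xhtml type is left out because it needs real markup handling rather than string escaping.

```php
<?php
// Hypothetical helper: serialise a <title> for type "text" or "html".
function atom_title($value, $type = 'text')
{
    if (!in_array($type, ['text', 'html'], true)) {
        throw new InvalidArgumentException('only text and html are handled here');
    }
    // Both types need one level of XML escaping on the way out.
    // For "html" the value is already HTML markup, so entities it
    // contains (such as &amp;) end up escaped twice in the feed.
    $escaped = htmlspecialchars($value, ENT_XML1 | ENT_QUOTES, 'UTF-8');
    return '<title type="' . $type . '">' . $escaped . '</title>';
}

$text = atom_title('AT&T bought by SBC!');
$html = atom_title('AT&amp;T bought <b>by SBC</b>!', 'html');

echo $text, "\n", $html, "\n";
```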

Links about Atom format

September 15, 2007